CN112738010B

CN112738010B - Data interaction method and system, interaction terminal and readable storage medium

Info

Publication number: CN112738010B
Application number: CN201911033625.7A
Authority: CN
Inventors: 盛骁杰
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2019-10-28
Filing date: 2019-10-28
Publication date: 2023-08-22
Anticipated expiration: 2039-10-28
Also published as: CN112738010A; WO2021083176A1

Abstract

A data interaction method and system, an interactive terminal, and a readable storage medium are disclosed. The data interaction method comprises: acquiring a data stream to be played from a playback control device in real time and playing and displaying it in real time, wherein the data stream to be played comprises video data and interaction identifiers, and each interaction identifier is associated with a specified frame time of the data stream to be played; in response to a trigger operation on an interaction identifier, acquiring interaction data corresponding to the specified frame time of that interaction identifier, wherein the interaction data comprises multi-angle free-view data; and, based on the interaction data, displaying multi-angle free-view images at the specified frame time. With this scheme, interaction data can be acquired during playback in response to a trigger operation on an interaction identifier and then displayed from multiple free viewing angles, which improves the user's interactive experience.

Description

Data interaction method and system, interactive terminal, and readable storage medium

Technical Field

The embodiments of the present invention relate to the technical field of data processing, and in particular to a data interaction method and system, an interactive terminal, and a readable storage medium.

Background

With the continuous development of Internet technology, data transmission capability has improved greatly, and users can watch videos on a wide variety of terminal devices.

In most playback scenarios, users watch video from a fixed viewing angle and cannot freely switch the viewpoint position, so the user experience leaves room for improvement.

Summary of the Invention

The problem addressed by the embodiments of the present invention is to provide a data interaction method and system, an interactive terminal, and a readable storage medium that allow the viewpoint position to be switched freely for interaction while watching a video.

An embodiment of the present invention discloses a data interaction method, comprising: acquiring a data stream to be played from a playback control device in real time and playing and displaying it in real time, wherein the data stream to be played includes video data and interaction identifiers, and each interaction identifier is associated with a specified frame time of the data stream to be played; in response to a trigger operation on an interaction identifier, acquiring interaction data corresponding to the specified frame time of that interaction identifier, wherein the interaction data includes multi-angle free-view data; and, based on the interaction data, displaying multi-angle free-view images at the specified frame time.
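By way of illustration only, the three steps of the method (real-time playback, trigger handling, free-view display) can be sketched as follows. The class names, the packet layout, and the `fetch_interaction_data` callable are hypothetical and are not taken from the embodiments; display is reduced to appending to a log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InteractionIdentifier:
    """Marker carried in the stream; tied to one specified frame time."""
    frame_time_ms: int


@dataclass
class StreamPacket:
    """One unit of the data stream to be played (assumed layout)."""
    frame_time_ms: int
    video_frame: bytes
    identifier: Optional[InteractionIdentifier] = None


class InteractiveTerminal:
    def __init__(self, fetch_interaction_data):
        # fetch_interaction_data: callable(frame_time_ms) -> interaction data
        self._fetch = fetch_interaction_data
        self.visible_identifiers = {}  # frame_time_ms -> identifier
        self.displayed = []            # stand-in for the actual display

    def on_packet(self, packet):
        """Play one packet and surface any identifier it carries."""
        self.displayed.append(("video", packet.frame_time_ms))
        if packet.identifier is not None:
            key = packet.identifier.frame_time_ms
            self.visible_identifiers[key] = packet.identifier

    def on_trigger(self, frame_time_ms):
        """A displayed identifier was triggered: fetch and show its data."""
        identifier = self.visible_identifiers[frame_time_ms]
        data = self._fetch(identifier.frame_time_ms)
        self.displayed.append(("free_view", identifier.frame_time_ms))
        return data
```

In this sketch the interaction data is looked up by frame time only, which mirrors the association between each interaction identifier and its specified frame time.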

Optionally, the multi-angle free-view data is generated based on a plurality of received frame images corresponding to the specified frame time. The plurality of frame images are obtained by a data processing device intercepting, at the specified frame time, the multiple video data streams synchronously acquired by a plurality of acquisition devices in an acquisition array. The multi-angle free-view data includes pixel data, depth data, and parameter data of the plurality of frame images, wherein the pixel data and the depth data of each frame image are associated with each other.
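One possible in-memory layout for such data is sketched below. The field names, the RGB pixel format, and the choice of intrinsic and extrinsic matrices as the "parameter data" are assumptions made for illustration; the text above only requires that pixel data, depth data, and parameter data be present and that pixel and depth data of each frame image be associated.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CameraParameters:
    """Parameter data for one acquisition device (assumed fields)."""
    intrinsics: List[List[float]]            # 3x3 matrix
    rotation: List[List[float]]              # 3x3 matrix
    translation: Tuple[float, float, float]


@dataclass
class ViewImage:
    """Pixel data and depth data of one frame image, kept together."""
    camera_id: str
    width: int
    height: int
    pixels: bytes       # assumed RGB, 3 bytes per pixel
    depth: List[float]  # one depth value per pixel
    params: CameraParameters

    def __post_init__(self):
        # Enforce the pixel/depth association: both cover the same pixels.
        n = self.width * self.height
        if len(self.pixels) != 3 * n or len(self.depth) != n:
            raise ValueError("pixel data and depth data must cover the same pixels")


@dataclass
class MultiAngleFreeViewData:
    """All views intercepted at one specified frame time."""
    frame_time_ms: int
    views: List[ViewImage]
```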

Optionally, the data processing device intercepting, at the specified frame time, the multiple video data streams synchronously acquired by the plurality of acquisition devices in the acquisition array comprises: the data processing device, based on a received video frame interception instruction, intercepting frame-level synchronized video frames at the specified frame time from the multiple video data streams.
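A minimal sketch of frame-level synchronized interception is given below, assuming each stream has already been decoded into a mapping from frame time to frame. The function name, the dictionary layout, and the policy of failing when any stream lacks the requested frame are hypothetical simplifications.

```python
def intercept_synchronized_frames(streams, frame_time_ms):
    """Intercept one frame per stream at the same specified frame time.

    streams: {camera_id: {frame_time_ms: frame}} (assumed layout).
    Returns (frames, frame_time_info), where frames maps camera_id to the
    intercepted frame. Raises LookupError if any stream has no frame at
    that time, so the result is always frame-level synchronized.
    """
    missing = [cam for cam, frames in streams.items()
               if frame_time_ms not in frames]
    if missing:
        raise LookupError(
            f"no frame at {frame_time_ms} ms for: {sorted(missing)}")
    frames = {cam: s[frame_time_ms] for cam, s in streams.items()}
    return frames, {"frame_time_ms": frame_time_ms}
```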

Optionally, the plurality of acquisition devices in the acquisition array are placed at different positions in an on-site acquisition area according to a preset multi-angle free-view range, and the data processing device is placed in an on-site non-acquisition area or in the cloud.

Optionally, the interaction identifier is generated by the playback control device: based on frame time information of the intercepted video frames received from the data processing device, the playback control device generates an interaction identifier associated with the video frame at the corresponding time in the data stream to be played.
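The association step can be illustrated as follows, under the assumption that the data stream to be played is a sequence of packets keyed by frame time. The packet dictionaries and the `interaction_identifier` key are hypothetical names chosen for this sketch.

```python
def attach_interaction_identifiers(packets, frame_times_ms):
    """Tag packets whose time matches reported frame time information.

    packets: iterable of dicts with a 'frame_time_ms' key (assumed layout).
    frame_times_ms: frame times reported by the data processing device.
    Yields packets unchanged, except that matching packets are copied and
    given an 'interaction_identifier' entry tied to the same frame time.
    """
    wanted = set(frame_times_ms)
    for packet in packets:
        if packet["frame_time_ms"] in wanted:
            packet = {
                **packet,
                "interaction_identifier": {
                    "frame_time_ms": packet["frame_time_ms"],
                },
            }
        yield packet
```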

Optionally, the interaction data further includes at least one of the following: on-site analysis data, information data of an acquisition object, information data of equipment associated with the acquisition object, information data of items deployed on site, and information data of logos displayed on site.

Optionally, the data interaction method further comprises: when an interaction end signal is detected, switching back to the data stream to be played acquired in real time from the playback control device and playing and displaying it in real time.

Optionally, switching, when an interaction end signal is detected, to the data stream to be played acquired in real time from the playback control device and playing and displaying it in real time comprises at least one of the following: when an interaction end operation instruction is received, switching to the data stream to be played acquired in real time from the playback control device and playing and displaying it in real time; and when it is detected that the multi-angle free-view image display at the specified frame time has reached the last image, switching to the data stream to be played acquired in real time from the playback control device and playing and displaying it in real time.
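The two end conditions above can be modelled as a small state machine. The sketch below is illustrative only; the `Mode` and `PlaybackSession` names and the image-count bookkeeping are assumptions, and actual playback is omitted.

```python
from enum import Enum, auto


class Mode(Enum):
    LIVE = auto()       # real-time playback of the data stream to be played
    FREE_VIEW = auto()  # multi-angle free-view display at a frame time


class PlaybackSession:
    def __init__(self):
        self.mode = Mode.LIVE
        self._remaining = 0

    def start_interaction(self, image_count):
        """Enter free-view display with a known number of images."""
        if image_count < 1:
            raise ValueError("image_count must be at least 1")
        self.mode = Mode.FREE_VIEW
        self._remaining = image_count

    def show_next_image(self):
        """Show one image; reaching the last one acts as an end signal."""
        if self.mode is not Mode.FREE_VIEW:
            raise RuntimeError("no interaction in progress")
        self._remaining -= 1
        if self._remaining == 0:
            self.end_interaction()

    def end_interaction(self):
        """Explicit end operation, or last image reached: back to live."""
        self.mode = Mode.LIVE
        self._remaining = 0
```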

Optionally, performing multi-angle free-view image display based on the interaction data comprises: determining a virtual viewpoint according to an interactive operation, wherein the virtual viewpoint is selected from a multi-angle free-view range, and the multi-angle free-view range is the range within which virtual viewpoints can be switched for viewing the area to be viewed; and displaying an image of the area to be viewed as seen from the virtual viewpoint, wherein the image is generated based on the interaction data and the virtual viewpoint.
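As a simplified example of determining a virtual viewpoint from an interactive operation, the sketch below assumes that the multi-angle free-view range is a one-dimensional angular interval and that the interactive operation is a horizontal drag. Both assumptions, as well as the function and parameter names, are hypothetical; the embodiments do not restrict the range or the gesture in this way.

```python
def update_virtual_viewpoint(current_deg, drag_px, deg_per_px, view_range):
    """Map a horizontal drag to a new viewpoint angle within the range.

    current_deg: current virtual viewpoint angle in degrees.
    drag_px: signed drag distance in pixels.
    deg_per_px: assumed sensitivity of the gesture.
    view_range: (min_deg, max_deg), the multi-angle free-view range.
    """
    lo, hi = view_range
    # Clamp so the virtual viewpoint is always selected from the range.
    return min(hi, max(lo, current_deg + drag_px * deg_per_px))
```

Clamping keeps the selected viewpoint inside the range for which the interaction data can support image generation.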

An embodiment of the present invention further provides a data processing system, comprising: an acquisition array, a data processing device, a server, a playback control device, and an interactive terminal, wherein:

The acquisition array includes a plurality of acquisition devices, which are placed at different positions in an on-site acquisition area according to a preset multi-angle free-view range and are adapted to synchronously acquire multiple video data streams in real time and upload the video data streams to the data processing device in real time;

The data processing device is adapted to, for the uploaded multiple video data streams, intercept the multiple video data streams at a specified frame time according to a received video frame interception instruction to obtain a plurality of frame images corresponding to the specified frame time and frame time information corresponding to the specified frame time, upload the plurality of frame images and the frame time information to the server, and send the frame time information of the specified frame time to the playback control device;

The server is adapted to receive the plurality of frame images and the frame time information uploaded by the data processing device and, based on the plurality of frame images, generate interaction data for interaction, wherein the interaction data includes multi-angle free-view data and is associated with the frame time information;

The playback control device is adapted to determine the specified frame time in the data stream to be played that corresponds to the frame time information uploaded by the data processing device, generate an interaction identifier associated with the specified frame time, and transmit the data stream to be played containing the interaction identifier to the interactive terminal;

The interactive terminal is adapted to, based on the received data stream to be played, play and display in real time the video containing the interaction identifier and, based on a trigger operation on the interaction identifier, acquire the interaction data stored on the server and corresponding to the specified frame time, so as to perform multi-angle free-view image display.

Optionally, the data processing device is adapted to intercept, at the specified frame time, the multiple video data streams synchronously acquired by the plurality of acquisition devices in the acquisition array to obtain the plurality of frame images;

the server is adapted to generate the multi-angle free-view data based on the received plurality of frame images corresponding to the specified frame time, wherein the multi-angle free-view data includes pixel data, depth data, and parameter data of the plurality of frame images, and the pixel data and the depth data of each frame image are associated with each other.

Optionally, the plurality of acquisition devices in the acquisition array are placed at different positions in the on-site acquisition area according to the preset multi-angle free-view range, the data processing device is placed in an on-site non-acquisition area or in the cloud, and the server is placed in an on-site non-acquisition area, in the cloud, or on a terminal.

Optionally, the playback control device is adapted to generate, based on the frame time information of the video frames intercepted by the data processing device, an interaction identifier associated with the video frame at the corresponding time in the data stream to be played.

Optionally, the interactive terminal is further adapted to, when an interaction end signal is detected, switch to the data stream to be played acquired in real time from the playback control device and play and display it in real time.

An embodiment of the present invention further provides a data processing system, comprising: an acquisition array, a data processing device, a playback control device, and an interactive terminal, wherein:

The acquisition array includes a plurality of acquisition devices, which are placed at different positions in an on-site acquisition area according to a preset multi-angle free-view range and are adapted to synchronously acquire multiple video data streams in real time and upload the video data streams to the data processing device in real time;

The data processing device is adapted to, for the uploaded multiple video data streams, intercept the multiple video data streams at a specified frame time according to a received video frame interception instruction to obtain a plurality of frame images corresponding to the specified frame time and frame time information corresponding to the specified frame time, and send the frame time information of the specified frame time to the playback control device;

The playback control device is adapted to determine the specified frame time in the data stream to be played that corresponds to the frame time information uploaded by the data processing device, generate an interaction identifier associated with the specified frame time, and transmit the data stream to be played containing the interaction identifier to the interactive terminal;

The interactive terminal is adapted to, based on the received data stream to be played, play and display in real time the video containing the interaction identifier and, based on a trigger operation on the interaction identifier, acquire from the data processing device the plurality of frame images at the specified frame time corresponding to the interaction identifier, generate, based on the plurality of frame images, interaction data for interaction, and then perform multi-angle free-view image display, wherein the interaction data includes multi-angle free-view data.

An embodiment of the present invention further provides an interactive terminal, comprising:

a data stream acquisition unit, adapted to acquire a data stream to be played from a playback control device in real time, wherein the data stream to be played includes video data and an interaction identifier, and the interaction identifier is associated with a specified frame time of the data stream to be played;

a playback and display unit, adapted to play and display in real time the video and the interaction identifier of the data stream to be played;

an interaction data acquisition unit, adapted to acquire, in response to a trigger operation on the interaction identifier, interaction data corresponding to the specified frame time, wherein the interaction data includes multi-angle free-view data; and

an interaction display unit, adapted to display, based on the interaction data, multi-angle free-view images at the specified frame time.

Optionally, the interactive terminal further comprises: a switching unit, adapted to, when an interaction end signal is detected, trigger switching to the data stream to be played acquired in real time by the data stream acquisition unit from the playback control device, which is then played and displayed in real time by the playback and display unit.

An embodiment of the present invention further provides another interactive terminal, comprising: a processor, a network component, a memory, and a display component, wherein:

The processor is adapted to acquire a data stream to be played in real time through the network component and, in response to a trigger operation on an interaction identifier, acquire interaction data corresponding to the specified frame time of the interaction identifier, wherein the data stream to be played includes video data and an interaction identifier, the interaction identifier is associated with a specified frame time of the data stream to be played, and the interaction data includes multi-angle free-view data; the memory is adapted to store the data stream to be played acquired in real time; and the display component is adapted to play and display in real time, based on the data stream to be played acquired in real time, the video and the interaction identifier of the data stream to be played, and to display, based on the interaction data, multi-angle free-view images at the specified frame time.

An embodiment of the present invention further provides another interactive terminal, comprising a memory and a processor, wherein the memory stores computer instructions executable on the processor, and the processor, when running the computer instructions, performs the steps of the data interaction method according to any embodiment of the present invention.

An embodiment of the present invention further provides a computer-readable storage medium storing computer instructions which, when run, perform the steps of the data interaction method according to any embodiment of the present invention.

Compared with the prior art, the technical solution of the present invention has the following beneficial effects:

In the embodiments of the present invention, a data stream to be played is acquired from a playback control device in real time and played and displayed in real time, and each interaction identifier in the data stream to be played is associated with a specified frame time of that data stream. In response to a trigger operation on an interaction identifier, interaction data corresponding to the specified frame time of that interaction identifier can then be acquired. Because the interaction data may include multi-angle free-view data, multi-angle free-view display at the specified frame time can be performed based on the interaction data. Thus, with the embodiments of the present invention, interaction data can be acquired during playback in response to a trigger operation on an interaction identifier and then displayed from multiple free viewing angles, improving the user's interactive experience.

Further, by performing multi-angle free-view display with interaction data that includes one or more of on-site analysis data, information data of the acquisition object, information data of equipment associated with the acquisition object, information data of items deployed on site, information data of logos displayed on site, and the like, richer interactive information can be presented to the user from multiple free viewing angles, further enhancing the user's interactive experience.

Brief Description of the Drawings

Fig. 1 is a schematic structural diagram of a data processing system in an embodiment of the present invention;

Fig. 2 is a flowchart of a data processing method in an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a data processing system in an application scenario in an embodiment of the present invention;

Fig. 4 is a schematic diagram of an interactive interface of an interactive terminal in an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a server in an embodiment of the present invention;

Fig. 6 is a flowchart of a data interaction method in an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of another data processing system in an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of a data processing system in another application scenario in an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of an interactive terminal in an embodiment of the present invention;

Fig. 10 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 11 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 12 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 13 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 14 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 15 is a flowchart of another data processing method in an embodiment of the present invention;

Fig. 16 is a flowchart of a method for intercepting synchronized video frames in compressed video data in an embodiment of the present invention;

Fig. 17 is a flowchart of another data processing method in an embodiment of the present invention;

Fig. 18 is a schematic structural diagram of a data processing device in an embodiment of the present invention;

Fig. 19 is a schematic structural diagram of another data processing system in an embodiment of the present invention;

Fig. 20 is a flowchart of a data synchronization method in an embodiment of the present invention;

Fig. 21 is a timing diagram of stream-pulling synchronization in an embodiment of the present invention;

Fig. 22 is a flowchart of another method for intercepting synchronized video frames in compressed video data in an embodiment of the present invention;

Fig. 23 is a schematic structural diagram of another data processing device in an embodiment of the present invention;

Fig. 24 is a schematic structural diagram of a data synchronization system in an embodiment of the present invention;

Fig. 25 is a schematic structural diagram of a data synchronization system in an application scenario in an embodiment of the present invention;

Fig. 26 is a flowchart of a depth map generation method in an embodiment of the present invention;

Fig. 27 is a schematic structural diagram of a server in an embodiment of the present invention;

Fig. 28 is a schematic diagram of a server cluster performing depth map processing in an embodiment of the present invention;

Fig. 29 is a flowchart of a virtual viewpoint image generation method in an embodiment of the present invention;

Fig. 30 is a flowchart of a method for combined rendering by a GPU in an embodiment of the present invention;

Fig. 31 is a schematic diagram of a hole-filling method in an embodiment of the present invention;

Fig. 32 is a schematic structural diagram of a virtual viewpoint image generation system in an embodiment of the present invention;

Fig. 33 is a schematic structural diagram of an electronic device in an embodiment of the present invention;

Fig. 34 is a schematic structural diagram of another data synchronization system in an embodiment of the present invention;

Fig. 35 is a schematic structural diagram of another data synchronization system in an embodiment of the present invention;

Fig. 36 is a schematic structural diagram of an acquisition device in an embodiment of the present invention;

Fig. 37 is a schematic diagram of an acquisition array in an application scenario in an embodiment of the present invention;

Fig. 38 is a schematic structural diagram of another data processing system in an embodiment of the present invention;

Fig. 39 is a schematic structural diagram of another interactive terminal in an embodiment of the present invention;

Fig. 40 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 41 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 42 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 43 is a schematic diagram of the connection of an interactive terminal in an embodiment of the present invention;

Fig. 44 is a schematic diagram of an interactive operation of an interactive terminal in an embodiment of the present invention;

Fig. 45 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 46 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 47 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention;

Fig. 48 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention.

Detailed Description

In traditional playback scenarios such as live broadcast, rebroadcast, and recorded broadcast, as noted above, a user can usually watch a game from only one viewpoint position and cannot freely switch viewpoint positions to watch the game picture or the course of the game from different viewing positions, and therefore cannot experience the feeling of watching the game on site while moving the viewpoint.

Six degrees of freedom (6DoF) technology can provide a viewing experience with a high degree of freedom. During viewing, users can adjust the viewing angle of the video through interaction and watch from whichever free viewpoint they choose, which greatly improves the viewing experience.

To realize 6DoF scenes, there are currently Free-D playback technology, light field rendering technology, depth-map-based 6DoF video generation technology, and so on. Free-D playback technology expresses 6DoF images through point cloud data of the scene obtained by shooting from multiple angles, while light field rendering technology obtains depth-of-field information and three-dimensional position information of pixels from changes in the focal length and spatial position of a dense light field, and thereby expresses 6DoF images. The depth-map-based 6DoF video generation method reconstructs a 6DoF image or video by combining and rendering the texture maps and depth maps of the corresponding groups in the image combination of the video frame at the moment of user interaction, based on the virtual viewpoint position and the parameter data corresponding to the texture maps and depth maps of the corresponding groups.

例如，在现场使用Free-D回放方案时，需要采用大量相机进行原始数据采集，并通过数字分量串行接口(Serial Digital Interface，SDI)采集卡汇总到现场的计算机房，然后通过现场机房中的计算服务器对原始数据进行处理，获得对空间所有点的三维位置以及像素信息进行表达的点云数据，并重建6DoF场景。这种方案使得现场采集、传输和计算的数据量极大，尤其对于直播和转播这类对传输网络以及计算服务器有很高要求的播放场景，6DoF重建场景实施成本过高，限制条件过多。并且，目前并没有很好的技术标准和工业级软硬件对点云数据进行支持，因此，从现场的原始数据采集到最终6DoF重建场景，需要花费较长的数据处理时间，从而不能满足多角度自由视角视频的低时延播放和实时互动的需求。For example, when the Free-D playback solution is used on site, a large number of cameras are required to collect raw data, which is collected to the on-site computer room through a digital component serial interface (Serial Digital Interface, SDI) capture card, and then through the on-site computer room. The computing server processes the raw data, obtains point cloud data expressing the three-dimensional positions and pixel information of all points in the space, and reconstructs the 6DoF scene. This solution makes the amount of data collected, transmitted and calculated on-site extremely large, especially for broadcast scenarios such as live broadcast and rebroadcast that have high requirements on transmission networks and computing servers, the implementation cost of 6DoF reconstruction scenarios is too high, and there are too many restrictions. Moreover, there are currently no good technical standards and industrial-grade software and hardware to support point cloud data. Therefore, it takes a long time to process data from the original data collection on site to the final 6DoF reconstruction scene, which cannot meet the multi-angle requirements. Requirements for low-latency playback and real-time interaction of free-view videos.

As another example, when the light field rendering solution is used on site, the depth-of-field information and three-dimensional position information of pixels must be obtained from changes in the focal length and spatial position of a dense light field. Because the light field images captured this way have a very high resolution, they usually have to be decomposed into several hundred conventional two-dimensional images. This solution therefore also produces an extremely large volume of data to capture, transmit, and compute on site, places high demands on the on-site transmission network and computing servers, is too costly to implement, is subject to too many constraints, and cannot process data quickly. Moreover, techniques for reconstructing 6DoF scenes from light field images are still experimental and cannot yet effectively meet the requirements of low-latency playback and real-time interaction for multi-angle free-view video.

In summary, both Free-D playback and light field rendering require very large amounts of storage and computation, so a large number of servers must be deployed on site. The result is excessive implementation cost, too many constraints, and an inability to process data quickly, which fails to meet viewing and interaction requirements and hinders widespread adoption.

Although the depth-map-based 6DoF video reconstruction method reduces the amount of computation during video reconstruction, constraints such as network transmission bandwidth and device decoding capability still make it difficult to meet the requirements of low-latency playback and real-time interaction for multi-angle free-view video.

To address these problems, some embodiments of the present invention propose a multi-angle free-view image generation scheme based on a distributed system architecture. An acquisition array composed of multiple acquisition devices is set up in the on-site acquisition area to synchronously capture frame images from multiple angles. A data processing device intercepts video frames from the captured frame images according to a frame interception instruction. The server takes the frame images of the multiple synchronized video frames uploaded by the data processing device as an image combination, and can determine the parameter data corresponding to the image combination and the depth data of each frame image in it. Based on that parameter data and on the pixel data and depth data of preset frame images in the image combination, the server performs frame image reconstruction along a preset virtual viewpoint path to obtain the corresponding multi-angle free-view video data. The multi-angle free-view video data is then inserted into the to-be-played data stream of the playback control device for transmission to the playback terminal and playback.

Referring to the schematic structural diagram of a data processing system in an application scenario of an embodiment of the present invention, the data processing system 10 includes a data processing device 11, a server 12, a playback control device 13, and a playback terminal 14. The data processing device 11 can intercept video frames from the frame images captured by the acquisition array in the on-site acquisition area. Intercepting only the video frames from which multi-angle free-view images are to be generated avoids transmitting and processing a large amount of data. The server 12 then generates the multi-angle free-view images, making full use of its computing power to produce multi-angle free-view video data quickly, so that the data can be inserted into the to-be-played data stream of the playback control device in time. Multi-angle free-view playback is thus achieved at low cost, meeting users' requirements for low-latency playback and real-time interaction with multi-angle free-view video.

To help those skilled in the art understand and implement the embodiments of this specification more clearly, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.

Referring to the flowchart of the data processing method shown in FIG. 2, an embodiment of the present invention may specifically include the following steps:

S21: Receive, as an image combination, the frame images of multiple synchronized video frames uploaded by the data processing device.

The multiple synchronized video frames are obtained by the data processing device, based on a video frame interception instruction, by intercepting the video frames at a specified frame time from multiple video data streams that are synchronously captured in real time at different positions in the on-site acquisition area and uploaded. The synchronized video frames have different shooting angles.

In a specific implementation, the video frame interception instruction may include information on the specified frame time, and the data processing device intercepts the video frames at the corresponding frame time from the multiple video data streams according to that information. The specified frame time may be expressed in frames, as the N-th to M-th frames, where N and M are integers not less than 1 and N ≤ M. Alternatively, it may be expressed in time, as the X-th to Y-th seconds, where X and Y are positive numbers and X ≤ Y. The multiple synchronized video frames may therefore include all frame-level synchronized video frames corresponding to the specified frame time, and the pixel data of each video frame forms the corresponding frame image.

For example, if the data processing device learns from the received video frame interception instruction that the specified frame time is the 2nd frame of the multiple video data streams, it intercepts the 2nd video frame of each video data stream. The intercepted 2nd frames of the streams are frame-level synchronized with one another and together form the multiple synchronized video frames.

As another example, suppose the acquisition frame rate is set to 25 fps, that is, 25 frames are captured per second. If the data processing device learns from the received video frame interception instruction that the specified frame time is the 1st second of the multiple video data streams, it can intercept the 25 video frames within the 1st second of each video data stream. The 1st video frames within the 1st second of the streams are frame-level synchronized with one another, as are the 2nd video frames, and so on up to the 25th video frames. Together these form the multiple synchronized video frames.

As a further example, if the data processing device learns from the received video frame interception instruction that the specified frame time is the 2nd and 3rd frames of the multiple video data streams, it can intercept the 2nd and 3rd video frames of each video data stream. The intercepted 2nd frames of the streams are frame-level synchronized with one another, as are the 3rd frames, and together they form the multiple synchronized video frames.
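The three examples above can be sketched as follows. This is a minimal illustration, not the implementation described in the patent: the function names `frame_numbers` and `intercept_synchronized`, the 1-based frame numbering, and the restriction of second-based ranges to whole seconds are all assumptions made for the sketch.

```python
from typing import Dict, List, Sequence


def frame_numbers(start: int, end: int, unit: str = "frame", fps: int = 25) -> List[int]:
    """Expand a specified frame time into 1-based frame numbers.

    unit="frame":  frames N..M, with 1 <= N <= M.
    unit="second": the X-th to Y-th seconds (whole seconds assumed), X <= Y.
    """
    if start < 1 or start > end:
        raise ValueError("need 1 <= start <= end")
    if unit == "frame":
        return list(range(start, end + 1))
    if unit == "second":
        # The 1st second covers frames 1..fps, the 2nd covers fps+1..2*fps, ...
        return list(range((start - 1) * fps + 1, end * fps + 1))
    raise ValueError(f"unknown unit: {unit}")


def intercept_synchronized(
    streams: Sequence[Sequence[str]], numbers: List[int]
) -> Dict[int, List[str]]:
    """For each frame number, take that frame from every stream.

    streams[k][i] is assumed to hold frame i+1 of camera k, and the streams
    are assumed to be frame-level synchronized already.
    """
    return {n: [stream[n - 1] for stream in streams] for n in numbers}
```

With three hypothetical streams, `intercept_synchronized(streams, frame_numbers(2, 3))` returns one group of three frames for frame 2 and another for frame 3, and `frame_numbers(1, 1, unit="second", fps=25)` expands to frames 1 through 25.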

S22: Determine the parameter data corresponding to the image combination.

In a specific implementation, the parameter data corresponding to the image combination may be obtained through parameter matrices, which may include an intrinsic matrix, an extrinsic matrix, a rotation matrix, a translation matrix, and the like. This makes it possible to determine the relationship between the three-dimensional geometric position of a given point on the surface of a spatial object and its corresponding point in the image combination.
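The relationship between a 3D point and its image point can be illustrated with the standard pinhole camera model. The sketch below is an assumption-laden illustration rather than the patent's method: it assumes zero skew and no lens distortion, and the name `project_point` and the convention Xc = R·X + t are choices made for the example.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def project_point(
    point: Sequence[float], K: Matrix, R: Matrix, t: Sequence[float]
) -> Tuple[Tuple[float, float], float]:
    """Project a 3D world point into an image with a pinhole model.

    Returns the pixel (u, v) and the depth, i.e. the distance from the
    optical center measured along the optical axis.
    """
    # World -> camera coordinates: Xc = R * X + t (extrinsic parameters).
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    depth = xc[2]
    if depth <= 0:
        raise ValueError("point is behind the camera")
    # Camera -> pixel coordinates (intrinsic parameters, zero skew assumed).
    u = K[0][0] * xc[0] / depth + K[0][2]
    v = K[1][1] * xc[1] / depth + K[1][2]
    return (u, v), depth
```

Here `K` plays the role of the intrinsic matrix and `R`, `t` together form the extrinsic parameters; the returned depth is the per-pixel quantity that the later steps store as depth data.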

In an embodiment of the present invention, a Structure from Motion (SFM) algorithm may be used to perform feature extraction, feature matching, and global optimization on the acquired image combination based on the parameter matrices, and the resulting parameter estimates are used as the parameter data corresponding to the image combination. The feature extraction algorithm may be any one of the following: the Scale-Invariant Feature Transform (SIFT) algorithm, the Speeded-Up Robust Features (SURF) algorithm, or the Features from Accelerated Segment Test (FAST) algorithm. The feature matching algorithm may include a Euclidean distance calculation method, the Random Sample Consensus (RANSAC) algorithm, and the like. The global optimization algorithm may include Bundle Adjustment (BA) and the like.

S23: Determine the depth data of each frame image in the image combination.

In a specific implementation, the depth data of each frame image may be determined based on the multiple frame images in the image combination. The depth data may include depth values corresponding to the pixels of each frame image in the image combination. The distance from the acquisition point to each point in the scene can serve as the depth value, and the depth values directly reflect the geometry of the visible surfaces in the to-be-viewed area. For example, taking the origin of the camera coordinate system as the optical center, the depth value may be the distance from each point in the scene to the optical center along the optical axis. Those skilled in the art will understand that this distance may be a relative value, and that the multiple frame images may use the same reference.

In an embodiment of the present invention, a binocular stereo vision algorithm may be used to calculate the depth data of each frame image. Alternatively, the depth data may be estimated indirectly by analyzing features of the frame image such as photometric features and light-dark (shading) features.
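As a small worked illustration of binocular stereo, the textbook relation for a rectified camera pair is Z = f·B/d, where f is the focal length in pixels, B the baseline, and d the disparity in pixels. The patent does not state which stereo formulation it uses, so the sketch below, including the name `depth_from_disparity`, is only an assumed example of the general idea.

```python
def depth_from_disparity(focal_px: float, baseline: float, disparity_px: float) -> float:
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d.

    focal_px:     focal length in pixels
    baseline:     distance between the two optical centers (e.g. meters)
    disparity_px: horizontal pixel offset of the point between the two views
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline / disparity_px
```

The result is expressed in the same unit as the baseline, and larger disparities correspond to points closer to the cameras.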

In another embodiment of the present invention, a Multi-View Stereo (MVS) algorithm may be used for frame image reconstruction. The reconstruction may use all pixels, or the pixels may be down-sampled so that only some of them are used. Specifically, the pixels of every frame image may be matched, the three-dimensional coordinates of each pixel reconstructed, and points with image consistency obtained, after which the depth data of each frame image is calculated. Alternatively, the pixels of selected frame images may be matched, the three-dimensional coordinates of the pixels of each selected frame image reconstructed, and points with image consistency obtained, after which the depth data of the corresponding frame images is calculated. The pixel data of a frame image corresponds to its calculated depth data. How frame images are selected can be set according to the specific scenario; for example, some frame images may be selected according to the distance between the frame image whose depth data is to be calculated and the other frame images.

S24: Based on the parameter data corresponding to the image combination and on the pixel data and depth data of preset frame images in the image combination, perform frame image reconstruction along a preset virtual viewpoint path to obtain the corresponding multi-angle free-view video data.

The multi-angle free-view video data may include multi-angle free-view spatial data and multi-angle free-view temporal data of frame images ordered by frame time.

The pixel data of a frame image may be either YUV data or RGB data, or any other data capable of representing the frame image. The depth data may include depth values in one-to-one correspondence with the pixel data of the frame image, or a subset of values selected from that set of depth values, with the selection method depending on the specific scenario. The virtual viewpoint is selected from a multi-angle free-view range, which is the range within which switching of viewpoints is supported when viewing the to-be-viewed area.

In a specific implementation, the preset frame images may be all of the frame images in the image combination or a selected subset of them. The selection method can be set according to the specific scenario. For example, the frame images at the corresponding positions in the image combination may be selected according to the positional relationship between the acquisition points. As another example, the frame images at the corresponding frame times in the image combination may be selected according to the frame time or frame period to be obtained.

Because the preset frame images may correspond to different frame times, each virtual viewpoint in the virtual viewpoint path can be associated with a frame time, and the corresponding frame images obtained according to the frame time of each virtual viewpoint. Frame image reconstruction is then performed for each virtual viewpoint based on the parameter data corresponding to the image combination and on the depth data and pixel data of the frame images at that viewpoint's frame time, yielding the corresponding multi-angle free-view video data. In this case, the multi-angle free-view video data may include multi-angle free-view spatial data and multi-angle free-view temporal data of frame images ordered by frame time. In other words, a specific implementation can produce not only a multi-angle free-view image at a single moment but also a multi-angle free-view video that is continuous or discontinuous in time.

In an embodiment of the present invention, the image combination includes A synchronized video frames, of which a1 correspond to a first frame time and a2 correspond to a second frame time, with a1 + a2 = A. A virtual viewpoint path composed of B virtual viewpoints is preset, in which b1 virtual viewpoints correspond to the first frame time and b2 virtual viewpoints correspond to the second frame time, with b1 + b2 ≤ 2B. A first frame image reconstruction is performed on the path formed by the b1 virtual viewpoints, based on the parameter data corresponding to the image combination and on the pixel data and depth data of the frame images of the a1 synchronized video frames at the first frame time. A second frame image reconstruction is performed on the path formed by the b2 virtual viewpoints, based on the parameter data corresponding to the image combination and on the pixel data and depth data of the frame images of the a2 synchronized video frames at the second frame time. The corresponding multi-angle free-view video data is finally obtained, and may include multi-angle free-view spatial data and multi-angle free-view temporal data of frame images ordered by frame time.
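The bookkeeping in this embodiment, pairing each virtual viewpoint with the synchronized frames of its frame time, can be sketched as follows. The function name `plan_reconstruction` and the tuple layouts are hypothetical; the actual rendering of each viewpoint is left out.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def plan_reconstruction(
    frames: Sequence[Tuple[str, Hashable]],
    path: Sequence[Tuple[str, Hashable]],
) -> List[Tuple[Hashable, str, List[str]]]:
    """Pair every virtual viewpoint with the source frames of its frame time.

    frames: (camera_id, frame_time) for each synchronized video frame.
    path:   (viewpoint_id, frame_time) for each virtual viewpoint.
    Returns (frame_time, viewpoint_id, camera_ids) ordered by frame time.
    """
    frames_at: Dict[Hashable, List[str]] = defaultdict(list)
    for camera_id, frame_time in frames:
        frames_at[frame_time].append(camera_id)

    plan = []
    # sorted() is stable, so viewpoints keep their path order within a frame time.
    for viewpoint_id, frame_time in sorted(path, key=lambda item: item[1]):
        if frame_time not in frames_at:
            raise KeyError(f"no synchronized frames for frame time {frame_time}")
        plan.append((frame_time, viewpoint_id, frames_at[frame_time]))
    return plan
```

In the terms of the embodiment, the entries for the first frame time use the a1 frames and cover the b1 viewpoints, and the entries for the second frame time use the a2 frames and cover the b2 viewpoints.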

It will be understood that the specified frame time and the virtual viewpoints can be divided more finely, yielding more synchronized video frames and virtual viewpoints corresponding to different frame times. The above embodiment is only an example and does not limit the specific implementation.

In an embodiment of the present invention, a Depth Image Based Rendering (DIBR) algorithm may be used to perform combined rendering of the pixel data and depth data of the preset frame images according to the parameter data corresponding to the image combination and the preset virtual viewpoint path. This realizes frame image reconstruction along the preset virtual viewpoint path and yields the corresponding multi-angle free-view video data.

S25: Insert the multi-angle free-view video data into the to-be-played data stream of the playback control device and play it through the playback terminal.

The playback control device can take multiple video data streams as input. These streams may come from the acquisition devices in the acquisition array or from other acquisition devices. The playback control device can select one input video data stream as the to-be-played data stream as needed. In particular, it can insert the multi-angle free-view video data obtained in step S24 into the to-be-played data stream, or switch from the video data stream of another input interface to the input interface of the multi-angle free-view video data. The playback control device outputs the selected to-be-played data stream to the playback terminal for playback, so the user can watch multi-angle free-view video images on the playback terminal. The playback terminal may be a video playback device such as a television, mobile phone, tablet, or computer, or any electronic device with a display screen.

In a specific implementation, the multi-angle free-view video data inserted into the to-be-played data stream of the playback control device may be retained in the playback terminal so that the user can watch it with time shifting. Time shifting may include operations performed while watching, such as pausing, rewinding, and fast-forwarding to the current moment.

With the above data processing method, the data processing device in the distributed system architecture handles the interception of the specified video frames, and the server reconstructs the multi-angle free-view video from the intercepted preset frame images. This avoids deploying a large number of servers on site and avoids directly uploading the video data streams captured by the acquisition devices of the acquisition array, which saves a large amount of transmission and server processing resources. Even when network transmission bandwidth is limited, the multi-angle free-view video for the specified video frames can be reconstructed in real time and played with low latency, reducing the impact of bandwidth limits. The method therefore lowers implementation cost, reduces constraints, is easy to implement, and meets the requirements of low-latency playback and real-time interaction for multi-angle free-view video.

In a specific implementation, the depth data of the preset frame images in the image combination may be mapped to the corresponding virtual viewpoints according to the relationship between the virtual parameter data of each virtual viewpoint in the preset virtual viewpoint path and the parameter data corresponding to the image combination. Frame image reconstruction is then performed according to the pixel data and depth data of the preset frame images mapped to the corresponding virtual viewpoints and the preset virtual viewpoint path, to obtain the corresponding multi-angle free-view video data.

The virtual parameter data of a virtual viewpoint may include virtual viewing position data and virtual viewing angle data. The parameter data corresponding to the image combination may include acquisition position data, shooting angle data, and the like. The reconstructed image may be obtained by first applying forward mapping and then applying backward mapping.
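The forward-mapping half of this procedure can be illustrated for a single pixel: back-project the source pixel to 3D using its depth, then project that point into the virtual viewpoint. This is a sketch under stated assumptions (pinhole cameras, zero skew, no distortion, convention Xc = R·X + t); the name `forward_map_pixel` is hypothetical, and the backward-mapping pass that fetches colors for the virtual image is not shown.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Camera = Tuple[Matrix, Matrix, Sequence[float]]  # (K, R, t)


def forward_map_pixel(
    u: float, v: float, depth: float, src: Camera, dst: Camera
) -> Tuple[Tuple[float, float], float]:
    """Map a source pixel with known depth into a virtual camera.

    Returns the pixel position in the virtual view and the depth there.
    """
    K, R, t = src
    # Pixel + depth -> source camera coordinates.
    xc = [(u - K[0][2]) / K[0][0] * depth, (v - K[1][2]) / K[1][1] * depth, depth]
    # Source camera -> world: X = R^T * (Xc - t).
    diff = [xc[i] - t[i] for i in range(3)]
    xw = [sum(R[j][i] * diff[j] for j in range(3)) for i in range(3)]
    # World -> virtual camera: Xv = Rd * X + td.
    Kd, Rd, td = dst
    xv = [sum(Rd[i][j] * xw[j] for j in range(3)) + td[i] for i in range(3)]
    if xv[2] <= 0:
        raise ValueError("point is behind the virtual camera")
    # Virtual camera -> virtual pixel.
    ud = Kd[0][0] * xv[0] / xv[2] + Kd[0][2]
    vd = Kd[1][1] * xv[1] / xv[2] + Kd[1][2]
    return (ud, vd), xv[2]
```

Applying this to every pixel of a source depth map gives a depth map at the virtual viewpoint; a backward pass can then look up, for each virtual pixel, the source pixel whose color should be used.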

In a specific implementation, the acquisition position data and shooting angle data may be referred to as external parameter data. The parameter data may also include internal parameter data, which may include attribute data of the acquisition device, so that the mapping relationship can be determined more accurately. For example, the internal parameter data may include distortion data; taking distortion into account allows the mapping relationship to be determined more accurately in space.

In a specific embodiment, to make subsequent data retrieval convenient, a stitched image corresponding to the image combination may be generated based on the pixel data and depth data of the image combination. The stitched image may include a first field and a second field, where the first field contains the pixel data of the image combination and the second field contains the depth data of the image combination. The stitched image corresponding to the image combination and the corresponding parameter data are then stored.

In another specific embodiment, to save storage space, a stitched image corresponding to the preset frame images in the image combination may be generated based on the pixel data and depth data of those preset frame images. This stitched image may include a first field and a second field, where the first field contains the pixel data of the preset frame images and the second field contains the depth data of the preset frame images. Only the stitched image corresponding to the preset frame images and the corresponding parameter data then need to be stored.

The first field corresponds to the second field. The stitched image can be divided into an image region and a depth map region: the pixel fields of the image region store the pixel data of the multiple frame images, and the pixel fields of the depth map region store the depth data of the multiple frame images. The pixel fields in the image region that store the pixel data of a frame image serve as the first field, and the pixel fields in the depth map region that store the depth data of a frame image serve as the second field. The stitched image of the image combination and the parameter data corresponding to the image combination can be stored in a data file. When the stitched image or the corresponding parameter data is needed, it can be read from the corresponding storage space according to the storage address contained in the header of the data file.
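One possible layout of such a stitched image is sketched below. The patent text does not fix the arrangement, so placing the image region in the top rows and the depth map region in the bottom rows, with views side by side, is purely an assumption, as are the names `stitch` and `extract`. Images are plain nested lists to keep the sketch dependency-free.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]


def stitch(textures: Sequence[Image], depths: Sequence[Image]) -> Image:
    """Build a stitched image: image region on top, depth map region below.

    All textures and depth maps are assumed to share one width and height.
    Views are placed side by side, in the same order in both regions.
    """
    if len(textures) != len(depths):
        raise ValueError("need one depth map per texture")
    height = len(textures[0])
    image_region = [[px for tex in textures for px in tex[r]] for r in range(height)]
    depth_region = [[d for dm in depths for d in dm[r]] for r in range(height)]
    return image_region + depth_region


def extract(stitched: Image, view: int, width: int) -> Tuple[Image, Image]:
    """Read back the pixel data (first field) and depth data (second field) of one view."""
    height = len(stitched) // 2
    cols = slice(view * width, (view + 1) * width)
    texture = [row[cols] for row in stitched[:height]]
    depth = [row[cols] for row in stitched[height:]]
    return texture, depth
```

Because the two regions use the same view order and geometry, the position of a view's pixel data determines the position of its depth data, which is the correspondence between the first and second fields that the text describes.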

In addition, image combinations may be stored in a video format. There may be multiple image combinations, each of which is the image combination for a different frame time obtained after the video is decapsulated and decoded.

In a specific implementation, the interaction frame time information of the interaction moment may be determined based on an image reconstruction instruction received from the interactive terminal. The stored stitched image of the preset frame images of the image combination at that interaction frame time, together with the parameter data corresponding to that image combination, is sent to the interactive terminal. Based on the virtual viewpoint position information determined by the interactive operation, the interactive terminal selects the corresponding pixel data and depth data in the stitched image and the corresponding parameter data according to a preset rule, performs combined rendering of the selected pixel data and depth data with the parameter data, and reconstructs and plays the multi-angle free-view video data corresponding to the virtual viewpoint position to be interacted with.

The preset rule can be set according to the specific scenario. For example, based on the virtual viewpoint position information determined by the interactive operation, the position information of the W neighboring virtual viewpoints closest, by distance, to the virtual viewpoint of the interaction moment may be selected. The pixel data and depth data that correspond to these W+1 virtual viewpoints in total (including the virtual viewpoint of the interaction moment) and that match the interaction frame time information are then obtained from the stitched image.
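The example rule above amounts to a nearest-neighbor selection. A minimal sketch, assuming viewpoint positions are 3D coordinates compared by Euclidean distance (the patent does not specify the distance measure) and using the hypothetical name `select_viewpoints`:

```python
import math
from typing import List, Sequence, Tuple

Position = Tuple[float, float, float]


def select_viewpoints(
    interaction_vp: Position, candidates: Sequence[Position], w: int
) -> List[Position]:
    """Return the interaction viewpoint plus its W nearest neighbors (W+1 in total).

    Neighbors are ordered from nearest to farthest by Euclidean distance.
    """
    if w < 0:
        raise ValueError("w must be non-negative")
    nearest = sorted(candidates, key=lambda p: math.dist(p, interaction_vp))[:w]
    return [interaction_vp] + nearest
```

The returned positions can then be used to look up the matching pixel data and depth data in the stitched image for the interaction frame time.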

The interaction frame time information is determined based on a trigger operation from the interactive terminal. The trigger operation may be input by the user or generated automatically by the interactive terminal; for example, the interactive terminal may automatically initiate a trigger operation when it detects the identifier of a multi-angle free-viewpoint data frame. When the user triggers the interaction manually, the time information may be the moment at which the user chooses to trigger the interaction after the interactive terminal displays interaction prompt information, or it may be historical time information for which the interactive terminal receives a user operation triggering the interaction, where the historical time information refers to a moment before the current playback moment.

In a specific implementation, based on the obtained stitched image of the preset frame images in the image combination at the interaction frame time and the corresponding parameter data, the interaction frame time information, and the virtual viewpoint position information at the interaction frame time, the interactive terminal can use the same method as in step S24 above to perform combined rendering of the pixel data and depth data of that stitched image, obtain the multi-angle free-view video data corresponding to the interacted virtual viewpoint position, and start playing the multi-angle free-view video from that virtual viewpoint position.

With the above solution, multi-angle free-view video data corresponding to the interacted virtual viewpoint position can be generated at any time based on an image reconstruction instruction from the interactive terminal, which can further improve the user's interactive experience.

Referring to the schematic structural diagram of the data processing system shown in FIG. 1, in an embodiment of the present invention the data processing system 10 may include a data processing device 11, a server 12, a playback control device 13, and a playback terminal 14, wherein:

The data processing device 11 is adapted to, based on a video frame capture instruction, capture the video frames at a specified frame time from multiple video data streams that are collected synchronously and in real time at different positions of the on-site collection area, thereby obtaining a plurality of synchronized video frames, and to upload the obtained synchronized video frames of the specified frame time to the server, wherein the multiple video data streams may be in a compressed format or in an uncompressed format;

The server 12 is adapted to take the frame images of the plurality of synchronized video frames of the specified frame time uploaded by the data processing device 11 as an image combination, determine the parameter data corresponding to the image combination and the depth data of each frame image in the image combination, and, based on the parameter data corresponding to the image combination and the pixel data and depth data of the preset frame images in the image combination, perform frame image reconstruction along a preset virtual viewpoint path to obtain the corresponding multi-angle free-view video data, the multi-angle free-view video data including multi-angle free-view spatial data and multi-angle free-view temporal data of frame images sorted by frame time;

The playback control device 13 is adapted to insert the multi-angle free-view video data into the data stream to be played;

The playback terminal 14 is adapted to receive the data stream to be played from the playback control device 13 and play it in real time.

In a specific implementation, the playback control device 13 may output the data stream to be played based on a control instruction. As an optional example, the playback control device 13 may select one of multiple data streams as the data stream to be played, or keep switching among the multiple data streams so as to output the data stream to be played continuously. A broadcast directing control device can serve as one kind of playback control device in the embodiments of the present invention. The broadcast directing control device may be a manual or semi-manual device that performs playback control based on externally input control instructions, or a virtual device that can perform directing control automatically based on artificial intelligence, big data learning, or a preset algorithm.

With the above data processing system, the work is split across a distributed architecture: the data processing device handles the capture of the specified video frames, and the server handles the reconstruction of the multi-angle free-view video from the captured preset frame images. This avoids deploying a large number of servers on site and avoids uploading the video data streams collected by the collection array directly, so a large amount of transmission resources and server processing resources can be saved. Even when network transmission bandwidth is limited, the multi-angle free-view video of the specified video frames can be reconstructed in real time and played with low latency. The scheme therefore reduces implementation cost and constraints, is easy to implement, and meets the requirements of low-latency playback and real-time interaction for multi-angle free-view video.

In a specific implementation, the server 12 is further adapted to generate, based on the pixel data and depth data of the preset frame images in the image combination, a stitched image for the preset frame time of the image combination. The stitched image includes a first field and a second field, wherein the first field includes the pixel data of the preset frame images in the image combination and the second field includes the depth data of the preset frame images in the image combination. The server 12 stores the stitched image of the image combination and the parameter data corresponding to the image combination.

In a specific implementation, the data processing system 10 may further include an interactive terminal 15, which is adapted to determine interaction frame time information based on a trigger operation, send an image reconstruction instruction containing the interaction frame time information to the server, and receive from the server the stitched image of the preset frame images in the image combination for the interaction frame time together with the corresponding parameter data. The interactive terminal 15 then determines virtual viewpoint position information based on an interactive operation, selects the corresponding pixel data and depth data in the stitched image according to a preset rule, performs combined rendering of the selected pixel data and depth data with the parameter data, and reconstructs and plays the multi-angle free-view video data corresponding to the virtual viewpoint position at the interaction frame time.

There may be one or more playback terminals 14 and one or more interactive terminals 15, and a playback terminal 14 and an interactive terminal 15 may be the same terminal device. In addition, at least one of the server, the playback control device, or the interactive terminal may act as the sender of the video frame capture instruction, or any other device capable of sending the video frame capture instruction may be used.

It should be noted that, in a specific implementation, the data processing device and the server can be deployed flexibly according to user needs. For example, the data processing device may be placed in the on-site non-collection area or in the cloud. As another example, the server may be placed in the on-site non-collection area, in the cloud, or on the terminal access side; on the terminal access side, edge node devices such as base stations, set-top boxes, routers, home data center servers, and hotspot devices can all serve as the server used to obtain the multi-angle free-view data. Alternatively, the data processing device and the server may be co-located and work together as one server cluster to generate multi-angle free-view data quickly, enabling low-latency playback of and real-time interaction with multi-angle free-view video.

With the above solution, multi-angle free-view video data corresponding to the virtual viewpoint position to be interacted with can be generated at any time based on an image reconstruction instruction from the interactive terminal, which can further improve the user's interactive experience.

To enable those skilled in the art to better understand and implement the embodiments of the present invention, the data processing system is described in detail below through a specific application scenario.

FIG. 3 is a schematic structural diagram of the data processing system in one application scenario, showing how the system is laid out for a basketball game. The data processing system includes a collection array 31 composed of multiple collection devices, a data processing device 32, a cloud server cluster 33, a playback control device 34, a playback terminal 35, and an interactive terminal 36.

Referring to FIG. 3, the basketball hoop on the left is taken as the core focus point. With the core focus point as the center of a circle, the fan-shaped area lying in the same plane as the core focus point serves as the preset multi-angle free-view range. The collection devices in the collection array 31 can be placed in a fan shape at different positions of the on-site collection area according to the preset multi-angle free-view range, and each can collect a video data stream from its own angle synchronously and in real time.

In a specific implementation, collection devices may also be placed in the ceiling area of the basketball arena, on the basketball stand, and so on. The collection devices may be arranged along a straight line, a fan shape, an arc, a circle, or an irregular shape. The specific arrangement can be chosen according to one or more factors such as the on-site environment, the number of collection devices, the characteristics of the collection devices, and the required imaging effect. A collection device may be any device with a camera function, such as an ordinary video camera, a mobile phone, or a professional camera.

So as not to interfere with the operation of the collection devices, the data processing device 32 can be placed in the on-site non-collection area and can be regarded as an on-site server. The data processing device 32 can send a stream-pulling instruction to each collection device in the collection array 31 over a wireless local area network, and each collection device, based on the stream-pulling instruction sent by the data processing device 32, transmits the video data stream it obtains to the data processing device 32 in real time. The collection devices in the collection array 31 may transmit their video data streams to the data processing device 32 in real time through a switch 37.

When the data processing device 32 receives a video frame capture instruction, it captures the video frames at the specified frame time from the received video data streams to obtain the frame images of a plurality of synchronized video frames, and uploads the synchronized video frames of the specified frame time to the cloud server cluster 33.
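As a rough illustration of the capture step, the sketch below picks, for each incoming stream, the frame whose timestamp is closest to the specified frame time. The data layout (one non-empty, time-sorted list of `(timestamp, frame)` pairs per camera) and the function names are assumptions made for this example; the patent does not prescribe how the frames are matched.

```python
from bisect import bisect_left


def nearest_frame(stream, frame_time):
    """Return the (timestamp, frame) pair of a non-empty, time-sorted stream
    whose timestamp is closest to frame_time."""
    times = [t for t, _ in stream]
    pos = bisect_left(times, frame_time)
    # Only the entries just before and at the insertion point can be nearest.
    candidates = stream[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda item: abs(item[0] - frame_time))


def capture_synchronized_frames(streams, frame_time):
    """Map each camera id to the frame closest to the specified frame time."""
    return {camera_id: nearest_frame(stream, frame_time)[1]
            for camera_id, stream in streams.items()}


# Hypothetical streams: timestamps in seconds, frames shown as labels.
streams = {
    "cam0": [(0.00, "a0"), (0.04, "a1"), (0.08, "a2")],
    "cam1": [(0.01, "b0"), (0.05, "b1"), (0.09, "b2")],
}
print(capture_synchronized_frames(streams, frame_time=0.045))
# {'cam0': 'a1', 'cam1': 'b1'}
```

Only the selected frames, rather than the full video streams, would then be uploaded to the server cluster.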

Correspondingly, the cloud server cluster 33 takes the received frame images of the plurality of synchronized video frames as an image combination, determines the parameter data corresponding to the image combination and the depth data of each frame image in the image combination, and, based on the parameter data corresponding to the image combination and the pixel data and depth data of the preset frame images in the image combination, performs frame image reconstruction along the preset virtual viewpoint path to obtain the corresponding multi-angle free-view video data. The multi-angle free-view video data may include multi-angle free-view spatial data and multi-angle free-view temporal data of frame images sorted by frame time.

The server can be placed in the cloud. To process data in parallel more quickly, the cloud server cluster 33 can be made up of several different servers or server groups, divided according to the data each one processes.

For example, the cloud server cluster 33 may include a first cloud server 331, a second cloud server 332, a third cloud server 333, and a fourth cloud server 334. The first cloud server 331 can be used to determine the parameter data corresponding to the image combination. The second cloud server 332 can be used to determine the depth data of each frame image in the image combination. The third cloud server 333 can use a Depth Image Based Rendering (DIBR) algorithm, based on the parameter data corresponding to the image combination and the pixel data and depth data of the image combination, to perform frame image reconstruction along the preset virtual viewpoint path. The fourth cloud server 334 can be used to generate the multi-angle free-view video, where the multi-angle free-view video data may include multi-angle free-view spatial data and multi-angle free-view temporal data of frame images sorted by frame time.
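The core geometric step of DIBR is warping a pixel with known depth from a source camera into a virtual viewpoint. The sketch below shows that step for a single pixel under strong simplifying assumptions that are not from the patent: pinhole cameras, camera axes aligned with the world axes (translation only, no rotation), and hypothetical names such as `Camera` and `warp_pixel`. A full DIBR pipeline would also handle rotation, occlusion, and hole filling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    fx: float  # focal length in pixels (x)
    fy: float  # focal length in pixels (y)
    cx: float  # principal point (x)
    cy: float  # principal point (y)
    position: tuple[float, float, float]  # camera centre in world coordinates


def warp_pixel(u, v, depth, src, dst):
    """Project pixel (u, v) with the given depth from src into dst.

    Returns (u', v', depth') in the destination view, or None if the point
    lies behind the destination camera.
    """
    # Back-project the pixel into the source camera's 3D coordinates.
    x = (u - src.cx) / src.fx * depth
    y = (v - src.cy) / src.fy * depth
    z = depth
    # Move to world coordinates, then into the destination camera's frame
    # (pure translation because the axes are assumed to be aligned).
    dx = x + src.position[0] - dst.position[0]
    dy = y + src.position[1] - dst.position[1]
    dz = z + src.position[2] - dst.position[2]
    if dz <= 0:
        return None
    # Re-project onto the destination image plane.
    return (dst.fx * dx / dz + dst.cx, dst.fy * dy / dz + dst.cy, dz)


source = Camera(1000.0, 1000.0, 640.0, 360.0, (0.0, 0.0, 0.0))
virtual = Camera(1000.0, 1000.0, 640.0, 360.0, (0.1, 0.0, 0.0))

# The centre pixel at depth 2.0 shifts left by fx * baseline / depth = 50 px.
print(warp_pixel(640.0, 360.0, 2.0, source, virtual))
```

In this simplified setup the horizontal shift equals focal length times baseline divided by depth, which is why nearer points move more between views than distant ones.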

It can be understood that the first cloud server 331, the second cloud server 332, the third cloud server 333, and the fourth cloud server 334 may each also be a server group composed of a server array or a server sub-cluster; the embodiments of the present invention impose no limitation on this.

In a specific implementation, the cloud server cluster 33 can store the pixel data and depth data of the image combination in the following way:

Based on the pixel data and depth data of the image combination, a stitched image for the corresponding frame time is generated. The stitched image includes a first field and a second field, wherein the first field includes the pixel data of the preset frame images in the image combination and the second field includes the depth data of the preset frame images in the image combination. The resulting stitched image and the corresponding parameter data can be stored in a data file; when the stitched image or the parameter data needs to be retrieved, it can be read from the corresponding storage space according to the storage address recorded in the header of the data file.
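One possible way to realize such a data file is sketched below: a fixed-size header records the offset and length of the pixel field, the depth field, and the parameter data, and a reader uses those addresses to locate each part. The binary layout, the header format, and the function names are assumptions for illustration only; the patent does not specify the file format.

```python
import struct

# Assumed header layout: six little-endian unsigned 32-bit integers holding
# (offset, length) for the pixel field, the depth field, and the parameter data.
HEADER = struct.Struct("<6I")


def pack_data_file(pixel_field: bytes, depth_field: bytes, params: bytes) -> bytes:
    """Write header + stitched image (pixel field, depth field) + parameter data."""
    pixel_offset = HEADER.size
    depth_offset = pixel_offset + len(pixel_field)
    params_offset = depth_offset + len(depth_field)
    header = HEADER.pack(pixel_offset, len(pixel_field),
                         depth_offset, len(depth_field),
                         params_offset, len(params))
    return header + pixel_field + depth_field + params


def read_data_file(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Use the addresses stored in the header to read back each section."""
    p_off, p_len, d_off, d_len, m_off, m_len = HEADER.unpack_from(blob, 0)
    return (blob[p_off:p_off + p_len],
            blob[d_off:d_off + d_len],
            blob[m_off:m_off + m_len])


# Placeholder payloads standing in for real pixel, depth, and parameter data.
blob = pack_data_file(b"PIXELS", b"DEPTH", b'{"camera": 0}')
pixels, depths, params = read_data_file(blob)
print(pixels, depths, params)
```

Keeping the addresses in the header lets a server return just the stitched image or just the parameter data without scanning the whole file.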

The playback control device 34 can then insert the received multi-angle free-view video data into the data stream to be played, and the playback terminal 35 receives the data stream to be played from the playback control device 34 and plays it in real time. The playback control device 34 may be a manual playback control device or a virtual playback control device. In a specific implementation, a dedicated server capable of switching video streams automatically can be set up as a virtual playback control device to control the data source. A broadcast directing control device, such as a director's console, can serve as one kind of playback control device in the embodiments of the present invention.

It can be understood that the data processing device 32 may be placed in the on-site non-collection area or in the cloud depending on the scenario, and that the server (cluster) and the playback control device may be placed in the on-site non-collection area, in the cloud, or on the terminal access side depending on the scenario. This embodiment is not intended to limit the specific implementation or the scope of protection of the present invention.

FIG. 4 is a schematic diagram of the interactive interface of an interactive terminal. The interactive interface of the interactive terminal 40 has a progress bar 41. With reference to FIG. 3 and FIG. 4, the interactive terminal 40 can associate the specified frame times received from the data processing device 32 with the progress bar and generate several interaction identifiers on the progress bar 41, such as interaction identifiers 42 and 43. The black segment of the progress bar 41 is the played part 41a, and the blank segment of the progress bar 41 is the unplayed part 41b.

When the system of the interactive terminal reaches the corresponding interaction identifier 43 on the progress bar 41, the interface of the interactive terminal 40 can display interaction prompt information. For example, when the user chooses to trigger the current interaction identifier 43, the interactive terminal 40, on receiving this feedback, generates an image reconstruction instruction for the interaction frame time corresponding to the interaction identifier 43 and sends the image reconstruction instruction, which contains the interaction frame time information, to the cloud server cluster 33. When the user does not choose to trigger it, the interactive terminal 40 can continue reading the subsequent video data, and the played part 41a of the progress bar keeps advancing. While watching, the user can also choose to trigger a historical interaction identifier, for example the interaction identifier 42 shown in the played part 41a of the progress bar; on receiving this feedback, the interactive terminal 40 generates an image reconstruction instruction for the interaction frame time corresponding to the interaction identifier 42.
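The behavior around the progress bar can be sketched as follows. The marker class, the dictionary-shaped instruction, and the example times are hypothetical and serve only to illustrate that both the current identifier and historical identifiers (those at or before the playback position) can be triggered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionMarker:
    marker_id: int
    frame_time: float  # position on the progress bar, in seconds


def triggerable_markers(markers, playback_time):
    """Markers the user can trigger now: the current one and historical ones."""
    return [m for m in markers if m.frame_time <= playback_time]


def build_reconstruction_instruction(marker):
    """Assumed message shape for the image reconstruction instruction."""
    return {"type": "image_reconstruction",
            "interaction_frame_time": marker.frame_time}


# Hypothetical markers matching identifiers 42 and 43 of FIG. 4.
markers = [InteractionMarker(42, 12.0), InteractionMarker(43, 30.0)]

available = triggerable_markers(markers, playback_time=30.0)
print([m.marker_id for m in available])                # [42, 43]
print(build_reconstruction_instruction(available[0]))  # instruction for marker 42
```

If the user triggers nothing, playback simply continues and the set of triggerable markers grows as the played part advances.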

When the cloud server cluster 33 receives the image reconstruction instruction from the interactive terminal 40, it can extract the stitched image of the preset frame images in the corresponding image combination and the parameter data of that image combination, and transmit them to the interactive terminal 40.

The interactive terminal 40 determines the interaction frame time information based on the trigger operation, sends an image reconstruction instruction containing the interaction frame time information to the server, and receives from the cloud server cluster 33 the stitched image of the preset frame images in the image combination for the interaction frame time together with the corresponding parameter data. It then determines the virtual viewpoint position information based on the interactive operation, selects the corresponding pixel data and depth data in the stitched image, and the corresponding parameter data, according to a preset rule, performs combined rendering of the selected pixel data and depth data, and reconstructs and plays the multi-angle free-view video data corresponding to the virtual viewpoint position at the interaction frame time.

It can be understood that the collection devices in the collection array can be connected to the data processing device through a switch and/or a local area network; that there may be one or more playback terminals and one or more interactive terminals; that a playback terminal and an interactive terminal may be the same terminal device; that the data processing device may be placed in the on-site non-collection area or in the cloud depending on the scenario; and that the server may be placed in the on-site non-collection area, in the cloud, or on the terminal access side depending on the scenario. This embodiment is not intended to limit the specific implementation or the scope of protection of the present invention.

The embodiments of the present invention also provide a server corresponding to the above data processing method. To enable those skilled in the art to better understand and implement the embodiments of the present invention, the server is described in detail below through specific embodiments with reference to the accompanying drawings.

Referring to the schematic structural diagram of the server shown in FIG. 5, in an embodiment of the present invention the server 50 may include:

a data receiving unit 51, adapted to receive the frame images of a plurality of synchronized video frames uploaded by the data processing device as an image combination;

a parameter data calculation unit 52, adapted to determine the parameter data corresponding to the image combination;

a depth data calculation unit 53, adapted to determine the depth data of each frame image in the image combination;

a video data acquisition unit 54, adapted to perform frame image reconstruction along the preset virtual viewpoint path based on the parameter data corresponding to the image combination and the pixel data and depth data of the preset frame images in the image combination, to obtain the corresponding multi-angle free-view video data, wherein the multi-angle free-view video data includes multi-angle free-view spatial data and multi-angle free-view temporal data of frame images sorted by frame time;

a first data transmission unit 55, adapted to insert the multi-angle free-view video data into the data stream to be played of the playback control device, for playback through the playback terminal.

The plurality of synchronized video frames may be obtained by the data processing device, based on a video frame capture instruction, by capturing the video frames at the specified frame time from multiple video data streams that are collected synchronously in real time at different positions of the on-site collection area and uploaded; the synchronized video frames have different shooting angles.

Depending on the scenario, the server may be placed in the on-site non-collection area, in the cloud, or on the terminal access side.

In a specific implementation, as shown in FIG. 5, the video data acquisition unit 54 may include:

a data mapping subunit 541, adapted to map the depth data of the preset frame images in the image combination to the corresponding virtual viewpoints, according to the relationship between the virtual parameter data of each virtual viewpoint on the preset virtual viewpoint path and the parameter data corresponding to the image combination;

a data reconstruction subunit 542, adapted to perform frame image reconstruction according to the pixel data and depth data of the preset frame images mapped to the corresponding virtual viewpoints and according to the preset virtual viewpoint path, to obtain the corresponding multi-angle free-view video data.

In a specific implementation, as shown in FIG. 5, the server 50 may further include:

a stitched image generation unit 56, adapted to generate the stitched image corresponding to the image combination based on the pixel data and depth data of the preset frame images in the image combination, where the stitched image may include a first field and a second field, the first field including the pixel data of the preset frame images in the image combination and the second field including the depth data of the preset frame images in the image combination;

a data storage unit 57, adapted to store the stitched image of the image combination and the parameter data corresponding to the image combination;

a data extraction unit 58, adapted to determine the interaction frame time information of the interaction moment based on an image reconstruction instruction received from the interactive terminal, and to extract the stitched image of the preset frame images in the image combination for that interaction frame time and the parameter data corresponding to that image combination;

a second data transmission unit 59, adapted to transmit the stitched image and the corresponding parameter data extracted by the data extraction unit 58 to the interactive terminal, so that the interactive terminal, based on the virtual viewpoint position information determined by the interactive operation, selects the corresponding pixel data and depth data in the stitched image, and the corresponding parameter data, according to a preset rule, performs combined rendering of the selected pixel data and depth data, and reconstructs and plays the multi-angle free-view video data corresponding to the virtual viewpoint position at the interaction frame time.

The embodiments of the present invention also provide a data interaction method and a data processing system. A data stream to be played is obtained from the playback control device in real time and played and displayed in real time, with each interaction identifier in the data stream associated with a specified frame time of the video data. In response to a trigger operation on an interaction identifier, the interaction data for the specified frame time corresponding to that interaction identifier can then be obtained. Because the interaction data can include multi-angle free-view data, a multi-angle free-view display of the specified frame time can be presented based on the interaction data.

With the data interaction scheme in the embodiments of the present invention, interaction data can be obtained during playback according to the trigger operation on an interaction identifier, and a multi-angle free-view display can then be presented, improving the user's interactive experience. The data interaction method and the data processing system are described in detail below through specific embodiments with reference to the accompanying drawings.

Referring to the flowchart of the data interaction method shown in FIG. 6, the data interaction method used in the embodiments of the present invention is described below through specific steps.

S61: obtain the data stream to be played from the playback control device in real time and play and display it in real time, where the data stream to be played includes video data and interaction identifiers, and each interaction identifier is associated with a specified frame time of the data stream to be played.

The specified frame time may be expressed in frames, with the N-th to M-th frames taken as the specified frame time, where N and M are integers not less than 1 and N≤M. Alternatively, the specified frame time may be expressed in time, with the X-th to Y-th seconds taken as the specified frame time, where X and Y are positive numbers and X≤Y.
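The two ways of expressing a specified frame time can be converted into one another once a frame rate is known. The sketch below assumes a constant frame rate and the convention that frame k (1-based) covers the interval from (k-1)/fps to k/fps seconds; both the convention and the function names are assumptions for illustration, since the patent does not define the mapping.

```python
import math


def frames_to_seconds(first_frame: int, last_frame: int, fps: float):
    """Convert a 1-based frame range N..M into a (start, end) range in seconds."""
    if not 1 <= first_frame <= last_frame:
        raise ValueError("require 1 <= N <= M")
    return ((first_frame - 1) / fps, last_frame / fps)


def seconds_to_frames(start: float, end: float, fps: float):
    """Convert a time range X..Y (seconds) into a 1-based frame range."""
    if not 0 < start <= end:
        raise ValueError("require 0 < X <= Y")
    first_frame = math.floor(start * fps) + 1
    last_frame = max(first_frame, math.ceil(end * fps))
    return (first_frame, last_frame)


print(frames_to_seconds(51, 75, fps=25.0))    # (2.0, 3.0)
print(seconds_to_frames(2.0, 3.0, fps=25.0))  # (51, 75)
```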

In a specific implementation, the data stream to be played may be associated with several specified frame times. The playback control device may generate an interaction identifier for each specified frame time based on the information of that specified frame time, so that the corresponding interaction identifier can be displayed at the specified frame time while the data stream is played and displayed in real time. The interaction identifiers and the video data may be associated in different ways depending on the actual situation.

In an embodiment of the present invention, the data stream to be played may include several frame times corresponding to the video data. Because each interaction identifier also has a corresponding specified frame time, the information of the specified frame time of each interaction identifier can be matched against the information of each frame time in the data stream to be played, and a frame time carrying the same information can be associated with the interaction identifier. The corresponding interaction identifier can then be displayed when real-time playback of the data stream reaches that frame time.

For example, the data stream to be played includes N frame times, and the playback control device generates M corresponding interaction identifiers based on the information of M specified frame times. If the information of the i-th frame time is the same as the information of the j-th specified frame time, the i-th frame time can be associated with the j-th interaction identifier, and the j-th interaction identifier can be displayed when real-time playback reaches the i-th frame time, where i is a natural number not greater than N and j is a natural number not greater than M.
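The matching described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes frame times are plain frame numbers and that each specified frame time is a frame range from N to M; the names `InteractionIdentifier` and `associate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionIdentifier:
    label: str        # hypothetical name such as "P1"
    first_frame: int  # N: first frame of the specified frame time (N >= 1)
    last_frame: int   # M: last frame of the specified frame time (N <= M)


def associate(frame_numbers, identifiers):
    """Map each frame time of the stream to the identifiers shown at that frame."""
    table = {}
    for frame in frame_numbers:
        labels = [p.label for p in identifiers
                  if p.first_frame <= frame <= p.last_frame]
        if labels:
            table[frame] = labels
    return table


# A stream with 7 frame times and 2 interaction identifiers.
identifiers = [
    InteractionIdentifier("P1", 3, 3),
    InteractionIdentifier("P2", 5, 6),
]
shown = associate(range(1, 8), identifiers)
print(shown)  # {3: ['P1'], 5: ['P2'], 6: ['P2']}
```

During playback, a terminal could look up the current frame number in `shown` to decide which identifiers to display.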

S62: In response to a trigger operation on an interaction identifier, acquire the interaction data corresponding to the specified frame time of the interaction identifier, the interaction data including multi-angle free-view data.

In a specific implementation, the interaction data corresponding to each specified frame time may be stored in a preset storage device. Because each interaction identifier corresponds to a specified frame time, performing a trigger operation on the interactive terminal triggers an interaction identifier displayed by the terminal, and the specified frame time corresponding to the triggered interaction identifier can be obtained from that trigger operation. The interaction data of the specified frame time corresponding to the triggered interaction identifier can thus be acquired.

For example, the preset storage device may store M pieces of interaction data, which correspond to M specified frame times respectively, and the M specified frame times correspond to M interaction identifiers. Assuming that the triggered interaction identifier is Pi, the specified frame time Ti corresponding to Pi can be obtained from it, and the interaction data of the specified frame time Ti corresponding to Pi is then acquired, where i is a natural number.
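The lookup from a triggered identifier Pi to its specified frame time Ti and then to the stored interaction data can be sketched with two dictionaries. The class `InteractionStore`, its methods, and the sample payloads are assumptions made for illustration; an actual storage device could be a server, an edge node, or the terminal itself.

```python
class InteractionStore:
    """In-memory stand-in for the preset storage device."""

    def __init__(self):
        self._frame_time_of = {}  # identifier P_i -> specified frame time T_i
        self._data_at = {}        # specified frame time T_i -> interaction data

    def register(self, identifier, frame_time, interaction_data):
        self._frame_time_of[identifier] = frame_time
        self._data_at[frame_time] = interaction_data

    def on_trigger(self, identifier):
        """Resolve a triggered identifier to its frame time and interaction data."""
        frame_time = self._frame_time_of[identifier]
        return frame_time, self._data_at[frame_time]


store = InteractionStore()
# Frame times are written as (first_frame, last_frame) tuples here.
store.register("P1", (120, 120), {"free_view": "data for frame 120"})
store.register("P2", (300, 360), {"free_view": "data for frames 300-360"})
frame_time, data = store.on_trigger("P2")
```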

The trigger operation may be one input by the user or one generated automatically by the interactive terminal.

The preset storage device may be located in the on-site non-acquisition area, in the cloud, or on the terminal access side. Specifically, the preset storage device may be the data processing device, the server, or the interactive terminal of the embodiments of the present invention, or an edge node device on the interactive terminal side, such as a base station, a set-top box, a router, a home data center server, or a hotspot device.

S63: Based on the interaction data, display multi-angle free-view images of the specified frame time.

In a specific implementation, an image reconstruction algorithm may be applied to the multi-angle free-view data in the interaction data, after which the multi-angle free-view images of the specified frame time are displayed.

If the specified frame time is a single frame time, a static multi-angle free-view image can be displayed; if the specified frame time corresponds to multiple frame times, a dynamic multi-angle free-view image can be displayed.

With the above scheme, interaction data can be acquired during video playback according to a trigger operation on an interaction identifier, and a multi-angle free-view display can then be performed, improving the user's interactive experience.

In a specific implementation, the multi-angle free-view data may be generated based on the received multiple frame images corresponding to the specified frame time. The multiple frame images are obtained by the data processing device intercepting, at the specified frame time, the multiple video data streams synchronously captured by the multiple acquisition devices in the acquisition array. The multi-angle free-view data may include pixel data, depth data, and parameter data of the multiple frame images, with the pixel data and the depth data of each frame image being associated with each other.

The pixel data of a frame image may be either YUV data or RGB data, or any other data capable of representing the frame image. The depth data may include depth values in one-to-one correspondence with the pixel data of the frame image, or a subset of values selected from that set of depth values. How the depth data is selected depends on the specific scenario.

In a specific implementation, the parameter data corresponding to the multiple frame images may be obtained through parameter matrices, which may include an intrinsic parameter matrix, an extrinsic parameter matrix, a rotation matrix, a translation matrix, and the like. From these, the relationship between the three-dimensional geometric position of a specified point on the surface of an object in space and its corresponding points in the multiple frame images can be determined.
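To make the role of these matrices concrete, the sketch below projects a 3D point into one frame image using the standard pinhole camera model. The text does not name a specific camera model, so the model itself, the function `project`, and the sample intrinsics (`fx`, `fy`, `cx`, `cy`) are assumptions chosen for illustration.

```python
def project(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model."""
    # Extrinsics: world coordinates -> camera coordinates (X_cam = R * X + t).
    x, y, z = (
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        raise ValueError("point is not in front of the camera")
    # Intrinsics: camera coordinates -> pixel coordinates.
    return (fx * x / z + cx, fy * y / z + cy)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project([1.0, 0.5, 2.0], identity, [0.0, 0.0, 0.0],
                fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
print(pixel)  # (1460.0, 790.0)
```

Applying the same function with the rotation and translation of each acquisition device gives the corresponding point in each frame image.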

In an embodiment of the present invention, an SFM (structure from motion) algorithm may be used to perform feature extraction, feature matching, and global optimization on the acquired multiple frame images based on the parameter matrices, and the resulting parameter estimates are used as the parameter data of the multiple frame images. For the specific algorithms used in feature extraction, feature matching, and global optimization, see the earlier description.

In a specific implementation, the depth data of each frame image may be determined based on the multiple frame images. The depth data may include depth values corresponding to the pixels of each frame image. The distance from the acquisition point to each point in the scene can serve as the depth value, and depth values directly reflect the geometry of the visible surfaces in the area to be viewed. For example, taking the origin of the camera coordinate system as the optical center, the depth value may be the distance from each point in the scene to the optical center along the camera's optical axis. Those skilled in the art will understand that this distance may be a relative value, and that the multiple frame images may use the same reference.

In an embodiment of the present invention, a binocular stereo vision algorithm may be used to compute the depth data of each frame image. Alternatively, the depth data may be estimated indirectly by analyzing features of the frame image such as photometric features and light-dark (shading) features.
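As an illustration of the binocular case, the sketch below uses the textbook relation for a rectified stereo pair, Z = f · B / d, where f is the focal length in pixels, B the baseline between the two cameras, and d the disparity in pixels. The text does not specify which stereo formulation is used, so this relation and the function name `depth_from_disparity` are assumptions.

```python
def depth_from_disparity(focal_length_px, baseline, disparity_px):
    """Depth along the optical axis for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return None  # no valid match, or a point at infinity
    return focal_length_px * baseline / disparity_px


# Focal length 1000 px, baseline 0.5 m, disparity 50 px.
depth = depth_from_disparity(1000.0, 0.5, 50.0)
print(depth)  # 10.0
```

The result is in the same unit as the baseline; a larger disparity means the point is closer to the cameras.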

In another embodiment of the present invention, an MVS (multi-view stereo) algorithm may be used for frame image reconstruction. The pixels of every frame image may be matched and the three-dimensional coordinates of each pixel reconstructed to obtain image-consistent points, from which the depth data of each frame image is computed. Alternatively, only the pixels of selected frame images may be matched and their three-dimensional coordinates reconstructed to obtain image-consistent points, from which the depth data of the corresponding frame images is computed. The pixel data of a frame image corresponds to the computed depth data. How frame images are selected may be set according to the specific scenario; for example, a subset of frame images may be selected according to the distance between the frame image whose depth data is to be computed and the other frame images.

In a specific implementation, the data processing device may, based on a received video frame interception instruction, intercept frame-level synchronized video frames at the specified frame time from the multiple video data streams.

In a specific implementation, the video frame interception instruction may include frame time information for intercepting video frames, and the data processing device intercepts the video frames of the corresponding frame time from the multiple video data streams according to that frame time information. The data processing device also sends the frame time information in the video frame interception instruction to the playback control device, which can obtain the corresponding specified frame time from the received frame time information and generate the corresponding interaction identifier.
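A minimal sketch of frame-level synchronized interception follows. It assumes each stream is already indexed by a shared frame number, which stands in for whatever synchronization mechanism the devices actually use; `intercept_frames` and the string placeholders for frame images are hypothetical.

```python
def intercept_frames(streams, frame_time):
    """Cut the frame with the given frame time out of every stream."""
    missing = [cam for cam, frames in streams.items() if frame_time not in frames]
    if missing:
        raise KeyError(f"no frame at {frame_time} for cameras {missing}")
    frame_images = {cam: frames[frame_time] for cam, frames in streams.items()}
    # The frame time is returned as well, since it is also forwarded
    # to the playback control device to generate the interaction identifier.
    return frame_images, frame_time


streams = {
    "cam1": {100: "cam1-f100", 101: "cam1-f101"},
    "cam2": {100: "cam2-f100", 101: "cam2-f101"},
}
frame_images, frame_time_info = intercept_frames(streams, 101)
```

Raising an error when any stream lacks the requested frame keeps the returned set frame-level synchronized rather than silently partial.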

In a specific implementation, the multiple acquisition devices in the acquisition array are placed at different positions in the on-site acquisition area according to the preset multi-angle free-view range, and the data processing device may be located in the on-site non-acquisition area or in the cloud.

In a specific implementation, the multi-angle free view may refer to the spatial position and viewing angle of a virtual viewpoint that allows the scene to be switched freely. For example, the multi-angle free view may be a six-degrees-of-freedom (6DoF) view, in which the spatial position of the virtual viewpoint is expressed as (x, y, z) and the viewing angle is expressed as three rotation directions, giving six degrees of freedom in total.

The multi-angle free-view range may be determined according to the needs of the application scenario.

In a specific implementation, the playback control device may generate, based on the frame time information of the intercepted video frames from the data processing device, an interaction identifier associated with the video frame at the corresponding time in the data stream to be played. For example, after receiving a video frame interception instruction, the data processing device sends the frame time information in the instruction to the playback control device, which may then generate the corresponding interaction identifier based on each piece of frame time information.

In a specific implementation, corresponding interaction data may be generated according to the objects shown on site, information associated with those objects, and so on. For example, the interaction data may further include at least one of the following: on-site analysis data, information data of a captured object, information data of equipment associated with a captured object, information data of items deployed on site, and information data of logos displayed on site. A multi-angle free-view display based on this interaction data can present richer interactive information to the user, further enhancing the user's interactive experience.

For example, when a basketball game is being played, the interaction data may include, in addition to the multi-angle free-view data, one or more of: analysis data of the game, information data of a particular player, information data of the shoes worn by a player, information data of the basketball, and information data of the logos of on-site sponsors.

In a specific implementation, so that playback can conveniently return to the data stream to be played once the image display has ended, and still referring to FIG. 6, the following step may be included after step S63:

S64: When an interaction end signal is detected, switch to acquiring the data stream to be played from the playback control device in real time and playing and displaying it in real time.

For example, when an interaction end operation instruction is received, switch to the data stream to be played that is acquired in real time from the playback control device and play and display it in real time.

As another example, when it is detected that the multi-angle free-view image display of the specified frame time has reached the last image, switch to the data stream to be played that is acquired in real time from the playback control device and play and display it in real time.
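The two ways of ending an interaction described above (an explicit end instruction, or reaching the last free-view image) can be sketched as a two-state machine on the terminal side. The class `Terminal`, the enum `Mode`, and all method names are hypothetical and only model the mode switch, not actual playback.

```python
from enum import Enum


class Mode(Enum):
    LIVE = "live"            # real-time playback of the data stream to be played
    FREE_VIEW = "free_view"  # multi-angle free-view image display


class Terminal:
    def __init__(self):
        self.mode = Mode.LIVE
        self._remaining = 0

    def trigger(self, image_count):
        """An interaction identifier was triggered; image_count images will be shown."""
        self.mode = Mode.FREE_VIEW
        self._remaining = image_count

    def end_interaction(self):
        """Explicit interaction end instruction: go back to the live stream."""
        self.mode = Mode.LIVE
        self._remaining = 0

    def show_next_image(self):
        """Show one free-view image; return to live playback after the last one."""
        if self.mode is not Mode.FREE_VIEW:
            return
        self._remaining -= 1
        if self._remaining == 0:
            self.end_interaction()


terminal = Terminal()
terminal.trigger(image_count=2)
terminal.show_next_image()
mode_after_first = terminal.mode   # still in free-view mode
terminal.show_next_image()
mode_after_last = terminal.mode    # back to live playback
```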

In a specific embodiment, displaying multi-angle free-view images based on the interaction data, as described in step S63, may specifically include the following steps:

Determine a virtual viewpoint according to the interactive operation, the virtual viewpoint being selected from the multi-angle free-view range, which is the range within which switching of virtual viewpoints for viewing the area to be viewed is supported. Then display the image of the area to be viewed as seen from the virtual viewpoint, the image being generated based on the interaction data and the virtual viewpoint.

In a specific implementation, a virtual viewpoint path may be preset, and the path may include several virtual viewpoints. Because the virtual viewpoint is selected from the multi-angle free-view range, a corresponding first virtual viewpoint can be determined from the viewing angle of the image being displayed at the time of the interactive operation, and the images corresponding to the virtual viewpoints can then be displayed in turn, starting from the first virtual viewpoint and following the preset order of virtual viewpoints.
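One way to pick the first virtual viewpoint and the display order is sketched below. It assumes the first virtual viewpoint is the path entry whose position is closest to the currently displayed view, which is only one possible reading of the text; `Viewpoint` and `display_order` are hypothetical names, and the three rotation fields follow the 6DoF description without being used in the distance.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewpoint:
    x: float
    y: float
    z: float
    yaw: float = 0.0
    pitch: float = 0.0
    roll: float = 0.0


def display_order(path, current):
    """Return the path from the viewpoint nearest to `current` to the end."""
    def distance(vp):
        return math.dist((vp.x, vp.y, vp.z), (current.x, current.y, current.z))

    start = min(range(len(path)), key=lambda i: distance(path[i]))
    return path[start:]


# A preset path of five viewpoints along the x axis.
path = [Viewpoint(float(i), 0.0, 2.0) for i in range(5)]
order = display_order(path, Viewpoint(2.2, 0.1, 2.0))
print([vp.x for vp in order])  # [2.0, 3.0, 4.0]
```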

In an embodiment of the present invention, a DIBR (depth-image-based rendering) algorithm may be used to jointly render the pixel data and depth data corresponding to the specified frame time of the triggered interaction identifier, according to the parameter data in the multi-angle free-view data and the preset virtual viewpoint path. This achieves image reconstruction along the preset virtual viewpoint path and yields the corresponding multi-angle free-view video data, so that the corresponding images can be displayed in turn, starting from the first virtual viewpoint and following the preset order of virtual viewpoints.
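The core step of depth-image-based rendering is warping a pixel with known depth from a captured view into a virtual view. The sketch below shows that single step under simplifying assumptions: a pinhole model, identical intrinsics for both views, and a rotation and translation that map the captured camera's coordinates to the virtual camera's. A full renderer would also handle occlusion, hole filling, and blending of several source views; `warp_pixel` is a hypothetical name.

```python
def warp_pixel(u, v, depth, intrinsics, rotation, translation):
    """Warp pixel (u, v) with the given depth into the virtual view."""
    fx, fy, cx, cy = intrinsics
    # Back-project the pixel to a 3D point in the captured camera's coordinates.
    point = [(u - cx) * depth / fx, (v - cy) * depth / fy, depth]
    # Move the point into the virtual camera's coordinates.
    x, y, z = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        return None  # the point lies behind the virtual camera
    # Project into the virtual image; z is the depth seen from the virtual view.
    return (fx * x / z + cx, fy * y / z + cy, z)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = (1000.0, 1000.0, 960.0, 540.0)
# Virtual camera placed 0.25 units to the right of the captured camera.
warped = warp_pixel(1460.0, 790.0, 2.0, intrinsics, identity, [-0.25, 0.0, 0.0])
print(warped)  # (1335.0, 790.0, 2.0)
```

Repeating the warp for every pixel and for each viewpoint on the preset path would give one reconstructed image per virtual viewpoint.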

If the specified frame time corresponds to a single frame time, the resulting multi-angle free-view video data may include the multi-angle free-view spatial data of images ordered by frame time, and a static multi-angle free-view image can be displayed. If the specified frame time corresponds to different frame times, the resulting multi-angle free-view video data may include the multi-angle free-view spatial data and multi-angle free-view temporal data of frame images ordered by frame time, and a dynamic multi-angle free-view image can be displayed, that is, the frame images of multi-angle free-view video frames.

An embodiment of the present invention also provides a system corresponding to the above data interaction method. To help those skilled in the art better understand and implement the embodiments of the present invention, the system is described in detail below through specific embodiments, with reference to the accompanying drawings.

Referring to the schematic structural diagram of the data processing system shown in FIG. 7, the data processing system 70 may include an acquisition array 71, a data processing device 72, a server 73, a playback control device 74, and an interactive terminal 75, wherein:

The acquisition array 71 may include multiple acquisition devices placed at different positions in the on-site acquisition area according to the preset multi-angle free-view range, and is adapted to synchronously capture multiple video data streams in real time and upload them in real time to the data processing device 72;

The data processing device 72 is adapted to intercept the uploaded multiple video data streams at a specified frame time according to a received video frame interception instruction, obtaining multiple frame images corresponding to the specified frame time and the frame time information corresponding to the specified frame time; to upload the multiple frame images and the frame time information to the server 73; and to send the frame time information of the specified frame time to the playback control device 74;

The server 73 is adapted to receive the multiple frame images and the frame time information uploaded by the data processing device 72, and to generate, based on the multiple frame images, interaction data for interaction, the interaction data including multi-angle free-view data and being associated with the frame time information;

The playback control device 74 is adapted to determine the specified frame time in the data stream to be played that corresponds to the frame time information uploaded by the data processing device 72, to generate an interaction identifier associated with the specified frame time, and to transmit the data stream to be played, containing the interaction identifier, to the interactive terminal 75;

The interactive terminal 75 is adapted to play and display, in real time, video containing the interaction identifier based on the received data stream to be played, and, based on a trigger operation on the interaction identifier, to acquire the interaction data stored on the server 73 and corresponding to the specified frame time, so as to display multi-angle free-view images.

With the above scheme, interaction data can be acquired during playback according to a trigger operation on an interaction identifier, and a multi-angle free-view display can then be performed, improving the user's interactive experience.

In a specific implementation, the multi-angle free view may refer to the spatial position and viewing angle of a virtual viewpoint that allows the scene to be switched freely. The multi-angle free-view range may be determined according to the needs of the application scenario. The multi-angle free view may be a six-degrees-of-freedom (6DoF) view.

In a specific implementation, the acquisition device itself may have encoding and encapsulation functions, so that the raw video data captured synchronously in real time from its angle can be encoded and encapsulated. The acquisition device may also have a compression function.

In a specific implementation, the server 73 is adapted to generate the multi-angle free-view data based on the received multiple frame images corresponding to the specified frame time. The multi-angle free-view data includes pixel data, depth data, and parameter data of the multiple frame images, with the pixel data and the depth data of each frame image being associated with each other.

In a specific implementation, the multiple acquisition devices in the acquisition array 71 may be placed at different positions in the on-site acquisition area according to the preset multi-angle free-view range; the data processing device 72 may be located in the on-site non-acquisition area or in the cloud; and the server 73 may be located in the on-site non-acquisition area, in the cloud, or on the terminal access side.

In a specific implementation, the playback control device 74 is adapted to generate, based on the frame time information of the video frames intercepted by the data processing device 72, an interaction identifier associated with the corresponding video frame in the data stream to be played.

In a specific implementation, the interactive terminal 75 is further adapted to switch, upon detecting an interaction end signal, to the data stream to be played acquired in real time from the playback control device 74 and to play and display it in real time.

To help those skilled in the art better understand and implement the embodiments of the present invention, the data processing system is described in detail below through a specific application scenario. FIG. 8 is a schematic structural diagram of a data processing system in another application scenario of the embodiments of the present invention, showing a basketball game broadcast scenario in which the site is the basketball court area on the left. The data processing system 80 may include an acquisition array 81 composed of acquisition devices, a data processing device 82, a cloud server cluster 83, a playback control device 84, and an interactive terminal 85.

With the basketball hoop as the core point of interest, a sector-shaped area centered on the core point of interest and lying in the same plane as it can serve as the preset multi-angle free-view range. Accordingly, the acquisition devices in the acquisition array 81 may be arranged in a fan shape at different positions in the on-site acquisition area according to the preset multi-angle free-view range, each synchronously capturing a video data stream in real time from its angle.

So as not to interfere with the acquisition devices, the data processing device 82 may be located in the on-site non-acquisition area. The data processing device 82 may send a stream pulling instruction to each acquisition device in the acquisition array 81 over a wireless local area network. Based on the stream pulling instruction sent by the data processing device 82, each acquisition device in the acquisition array 81 transmits its video data stream to the data processing device 82 in real time. The acquisition devices in the acquisition array 81 may transmit their video data streams to the data processing device 82 in real time through a switch 87. Each acquisition device may compress the captured raw video data in real time and transmit it in real time to the data processing device, further saving local area network transmission resources.

When the data processing device 82 receives a video frame interception instruction, it intercepts the video frames at the specified frame time from the received multiple video data streams, obtaining the frame images corresponding to the multiple video frames and the frame time information corresponding to the specified frame time. It uploads the multiple frame images of the specified frame time and the corresponding frame time information to the cloud server cluster 83, and sends the frame time information of the specified frame time to the playback control device 84. The video frame interception instruction may be issued manually by a user or generated automatically by the data processing device.

The servers may be located in the cloud. To process data in parallel more quickly, the cloud server cluster 83 may be composed of multiple different servers or server groups, divided according to the data they process.

For example, the cloud server cluster 83 may include a first cloud server 831, a second cloud server 832, a third cloud server 833, and a fourth cloud server 834. The first cloud server 831 may be used to determine the parameter data corresponding to the multiple frame images; the second cloud server 832 may be used to determine the depth data of each of the multiple frame images; the third cloud server 833 may use the DIBR algorithm to perform frame image reconstruction along the preset virtual viewpoint path, based on the parameter data corresponding to the multiple frame images and the depth data and pixel data of preset frame images among the multiple frame images; and the fourth cloud server 834 may be used to generate the multi-angle free-view video.

In a specific implementation, the multi-angle free-view video data may include the multi-angle free-view spatial data and multi-angle free-view temporal data of frame images ordered by frame time. The interaction data may include multi-angle free-view data, which may include pixel data, depth data, and parameter data of multiple frame images, with the pixel data and the depth data of each frame image being associated with each other.

The cloud server cluster 83 may store the interaction data according to the specified frame time information.

The playback control device 84 may generate, according to the frame time information uploaded by the data processing device, an interaction identifier associated with the specified frame time, and transmit the to-be-played data stream containing the interaction identifier to the interactive terminal 85.

Based on the received to-be-played data stream, the interactive terminal 85 may play and display the video in real time and display each interaction identifier at its corresponding video frame time. When an interaction identifier is triggered, the interactive terminal 85 may obtain the interaction data that is stored in the cloud server cluster 83 and corresponds to the specified frame time, so as to display multi-angle free-view images. On detecting an interaction end signal, the interactive terminal 85 may switch back to obtaining the to-be-played data stream from the playback control device 84 in real time and playing it in real time.

Referring to the schematic structural diagram of another data processing system shown in FIG. 38, the data processing system 380 may include an acquisition array 381, a data processing device 382, a playback control device 383, and an interactive terminal 384, wherein:

The acquisition array 381 includes multiple acquisition devices, which are placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range, and is adapted to synchronously capture multiple video data streams in real time and upload the video data streams to the data processing device in real time;

The data processing device 382 is adapted to, for the uploaded video data streams and according to a received video frame capture instruction, capture the multiple video data streams at a specified frame time to obtain multiple frame images corresponding to the specified frame time and frame time information corresponding to the specified frame time, and to send the frame time information of the specified frame time to the playback control device 383;

The playback control device 383 is adapted to determine the specified frame time in the to-be-played data stream that corresponds to the frame time information uploaded by the data processing device 382, generate an interaction identifier associated with the specified frame time, and transmit the to-be-played data stream containing the interaction identifier to the interactive terminal 384;

The interactive terminal 384 is adapted to play and display, in real time and based on the received to-be-played data stream, the video containing the interaction identifier; to obtain from the data processing device 382, based on a trigger operation on the interaction identifier, the multiple frame images at the specified frame time corresponding to the interaction identifier; to generate, based on the multiple frame images, interaction data for interaction; and then to display multi-angle free-view images, wherein the interaction data includes multi-angle free-view data.
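The capture step performed by the data processing device 382 can be illustrated with a minimal sketch. It assumes, purely for the example, that each uploaded stream is available as a mapping from an integer frame time to a frame image, and that frame images are plain strings; the function name `capture_at_frame_time` and the `Streams` type are hypothetical and do not come from the described system.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: camera id -> {frame time -> frame image}.
Streams = Dict[int, Dict[int, str]]


def capture_at_frame_time(streams: Streams, frame_time: int) -> Tuple[List[str], int]:
    """Take one frame image from every stream at `frame_time`.

    Returns the frame images (ordered by camera id) together with the
    frame time information that would be sent to the playback control device.
    """
    missing = [cam for cam, frames in streams.items() if frame_time not in frames]
    if missing:
        # Without a frame from every view, the set of views is incomplete.
        raise KeyError(f"streams {sorted(missing)} have no frame at time {frame_time}")
    frame_images = [streams[cam][frame_time] for cam in sorted(streams)]
    return frame_images, frame_time
```

In this sketch the frame images stay on the data processing device side, while only the returned frame time would be forwarded so that an interaction identifier can be associated with it.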

In a specific implementation, the data processing device may be deployed flexibly according to user requirements; for example, it may be placed in an on-site non-acquisition area or in the cloud.

With the above data processing system, interaction data can be obtained during playback according to a trigger operation on an interaction identifier, and a multi-angle free-view display can then be performed, improving the user's interactive experience.

An embodiment of the present invention further provides a terminal corresponding to the above data interaction method. To enable those skilled in the art to better understand and implement the embodiments of the present invention, specific embodiments are described in detail below with reference to the accompanying drawings.

Referring to the schematic structural diagram of an interactive terminal shown in FIG. 9, the interactive terminal 90 may include:

a data stream acquisition unit 91, adapted to obtain a to-be-played data stream from a playback control device in real time, the to-be-played data stream including video data and an interaction identifier, the interaction identifier being associated with a specified frame time of the to-be-played data stream;

a playback display unit 92, adapted to play and display, in real time, the video and the interaction identifier of the to-be-played data stream;

an interaction data acquisition unit 93, adapted to obtain, in response to a trigger operation on the interaction identifier, the interaction data corresponding to the specified frame time, the interaction data including multi-angle free-view data;

an interaction display unit 94, adapted to display, based on the interaction data, multi-angle free-view images at the specified frame time;

a switching unit 95, adapted to trigger, when an interaction end signal is detected, a switch back to the to-be-played data stream obtained in real time by the data stream acquisition unit 91 from the playback control device, for real-time playback and display by the playback display unit 92.
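The cooperation of these units can be pictured as a two-state machine: real-time playback, and multi-angle free-view interaction. The sketch below is one possible way to model that switching, under the assumption that interaction data is fetched through a caller-supplied function; the class `InteractiveTerminal`, the `Mode` enum, and the method names are hypothetical.

```python
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    LIVE = "live"                # real-time playback of the to-be-played stream
    INTERACTION = "interaction"  # multi-angle free-view display


class InteractiveTerminal:
    """Hypothetical model of the mode switching between units 91 to 95."""

    def __init__(self, fetch_interaction_data: Callable[[float], object]) -> None:
        # Stands in for the interaction data acquisition unit (assumption).
        self._fetch = fetch_interaction_data
        self.mode = Mode.LIVE
        self.interaction_data: Optional[object] = None

    def on_identifier_triggered(self, frame_time: float) -> None:
        """Trigger operation: fetch data for the frame time, enter interaction."""
        self.interaction_data = self._fetch(frame_time)
        self.mode = Mode.INTERACTION

    def on_interaction_end(self) -> None:
        """Interaction end signal: drop the data, return to real-time playback."""
        self.interaction_data = None
        self.mode = Mode.LIVE
```

Passing the fetch function in from outside keeps the sketch neutral on whether the interaction data comes from a server or is generated on the terminal.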

The interaction data may be generated by a server and transmitted to the interactive terminal, or may be generated by the interactive terminal itself.

While playing the video, the interactive terminal may obtain the to-be-played data stream from the playback control device in real time and may display the corresponding interaction identifier when the corresponding frame time is reached. In a specific implementation, FIG. 4 is a schematic diagram of an interactive interface of an interactive terminal in an embodiment of the present invention.

The interactive terminal 40 obtains the to-be-played data stream from the playback control device in real time. When real-time playback reaches the first frame time T1, a first interaction identifier 42 may be displayed on the progress bar 41; when real-time playback reaches the second frame time T2, a second interaction identifier 43 may be displayed on the progress bar. The black part of the progress bar is the played part, and the white part is the unplayed part.
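The progress-bar behaviour described here reduces to a simple rule: an interaction identifier is shown once playback has reached its frame time. A minimal sketch follows, assuming frame times and the playback position are expressed in seconds; the function name `visible_identifiers` and the example times are hypothetical.

```python
from typing import Dict, List


def visible_identifiers(identifier_times: Dict[int, float], playback_time: float) -> List[int]:
    """Return the identifiers whose frame time has been reached, in time order.

    `identifier_times` maps an identifier (e.g. 42, 43) to its frame time.
    """
    reached = [(t, ident) for ident, t in identifier_times.items() if t <= playback_time]
    return [ident for _, ident in sorted(reached)]


# Example with assumed frame times T1 = 30 s and T2 = 75 s.
markers = {42: 30.0, 43: 75.0}
print(visible_identifiers(markers, 10.0))
print(visible_identifiers(markers, 30.0))
print(visible_identifiers(markers, 80.0))
```

Identifiers returned for a playback position later than their frame time correspond to the historical identifiers in the played part of the progress bar.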

The trigger operation may be input by the user or generated automatically by the interactive terminal; for example, the interactive terminal may automatically initiate the trigger operation when it detects an identifier of a multi-angle free-viewpoint data frame. When the user triggers manually, the triggered time may be the time information at which the user chooses to trigger the interaction after the interactive terminal displays interaction prompt information, or historical time information for which the interactive terminal receives a user operation triggering the interaction, where the historical time information may be time information preceding the current playback time.

With reference to FIG. 4, FIG. 7, and FIG. 9, when the system of the interactive terminal reads the corresponding interaction identifier 43 on the progress bar 41, it may display interaction prompt information. If the user does not choose to trigger, the interactive terminal 40 may continue reading subsequent video data, and the played part of the progress bar 41 continues to advance. If the user chooses to trigger, the interactive terminal 40, on receiving this feedback, generates an image reconstruction instruction for the specified frame time of the corresponding interaction identifier and sends it to the server 73.

For example, when the user chooses to trigger the current interaction identifier 43, the interactive terminal 40, on receiving the feedback, generates an image reconstruction instruction for the specified frame time T2 corresponding to the interaction identifier 43 and sends it to the server 73. According to the image reconstruction instruction, the server may send the interaction data corresponding to the specified frame time T2.

The user may also choose, while watching, to trigger a historical interaction identifier, for example the interaction identifier 42 shown in the played part 41a of the progress bar. On receiving the feedback, the interactive terminal 40 generates an image reconstruction instruction for the specified frame time T1 corresponding to the interaction identifier 42 and sends it to the server 73. According to the image reconstruction instruction, the server may send the interaction data corresponding to the specified frame time T1. The interactive terminal 40 may apply an image reconstruction algorithm to the multi-angle free-view data in the interaction data and then display multi-angle free-view images at the specified frame time. If the specified frame time is a single frame time, a static multi-angle free-view image is displayed; if the specified frame time corresponds to multiple frame times, a dynamic multi-angle free-view image is displayed.

With reference to FIG. 4, FIG. 38, and FIG. 9, when the system of the interactive terminal reads the corresponding interaction identifier 43 on the progress bar 41, it may display interaction prompt information. If the user does not choose to trigger, the interactive terminal 40 may continue reading subsequent video data, and the played part of the progress bar 41 continues to advance. If the user chooses to trigger, the interactive terminal 40, on receiving this feedback, generates an image reconstruction instruction for the specified frame time of the corresponding interaction identifier and sends it to the data processing device 382.

For example, when the user chooses to trigger the current interaction identifier 43, the interactive terminal 40, on receiving the feedback, generates an image reconstruction instruction for the specified frame time T2 corresponding to the interaction identifier 43 and sends it to the data processing device 382. According to the image reconstruction instruction, the data processing device 382 may send the multiple frame images corresponding to the specified frame time T2.

The user may also choose, while watching, to trigger a historical interaction identifier, for example the interaction identifier 42 shown in the played part 41a of the progress bar. On receiving the feedback, the interactive terminal 40 generates an image reconstruction instruction for the specified frame time T1 corresponding to the interaction identifier 42 and sends it to the data processing device. According to the image reconstruction instruction, the data processing device may send the multiple frame images corresponding to the specified frame time T1.

The interactive terminal 40 may generate interaction data for interaction based on the multiple frame images, apply an image reconstruction algorithm to the multi-angle free-view data in the interaction data, and then display multi-angle free-view images at the specified frame time. If the specified frame time is a single frame time, a static multi-angle free-view image is displayed; if the specified frame time corresponds to multiple frame times, a dynamic multi-angle free-view image is displayed.

In a specific implementation, the interactive terminal of the embodiments of the present invention may be an electronic device with a touch screen function, a head-mounted virtual reality (VR) terminal, an edge node device connected to a display, or an Internet of Things (IoT) device with a display function.

FIG. 40 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention, in which the interactive terminal is an electronic device 400 with a touch screen function. When the corresponding interaction identifier 402 on the progress bar 401 is read, the interface of the electronic device 400 may display an interaction prompt box 403. The user may make a selection according to the content of the interaction prompt box 403. When the user performs the trigger operation of selecting "Yes", the electronic device 400, on receiving the feedback, may generate an image reconstruction instruction for the interaction frame time corresponding to the interaction identifier 402; when the user performs the non-trigger operation of selecting "No", the electronic device 400 may continue reading subsequent video data.

FIG. 41 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention, in which the interactive terminal is a head-mounted VR terminal 410. When the corresponding interaction identifier 412 on the progress bar 411 is read, the interface of the head-mounted VR terminal 410 may display an interaction prompt box 413. The user may make a selection according to the content of the interaction prompt box 413. When the user performs the trigger operation of selecting "Yes" (for example, nodding), the head-mounted VR terminal 410, on receiving the feedback, may generate an image reconstruction instruction for the interaction frame time corresponding to the interaction identifier 412; when the user performs the non-trigger operation of selecting "No" (for example, shaking the head), the head-mounted VR terminal 410 may continue reading subsequent video data.

FIG. 42 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention, in which the interactive terminal is an edge node device 421 connected to a display 420. When the edge node device 421 reads the corresponding interaction identifier 423 on the progress bar 422, the display 420 may display an interaction prompt box 424. The user may make a selection according to the content of the interaction prompt box 424. When the user performs the trigger operation of selecting "Yes", the edge node device 421, on receiving the feedback, may generate an image reconstruction instruction for the interaction frame time corresponding to the interaction identifier 423; when the user performs the non-trigger operation of selecting "No", the edge node device 421 may continue reading subsequent video data.

In a specific implementation, the interactive terminal may establish a communication connection, either wired or wireless, with at least one of the above data processing device and server.

FIG. 43 is a schematic diagram of the connections of an interactive terminal in an embodiment of the present invention. The edge node device 430 establishes wireless connections with interaction devices 431, 432, and 433 through the Internet of Things.

In a specific implementation, after an interaction identifier is triggered, the interactive terminal may display multi-angle free-view images at the specified frame time corresponding to the triggered interaction identifier and determine virtual viewpoint position information based on the interaction operation. FIG. 44 is a schematic diagram of an interaction operation on an interactive terminal in an embodiment of the present invention: the user may operate horizontally or vertically on the interaction interface, and the operation trajectory may be a straight line or a curve.

In a specific implementation, FIG. 45 is a schematic diagram of an interactive interface of another interactive terminal in an embodiment of the present invention. After the user taps an interaction identifier, the interactive terminal obtains the interaction data at the specified frame time of that interaction identifier.

If the user performs no new operation, the trigger operation itself is the interaction operation, and the corresponding first virtual viewpoint may be determined according to the viewing angle of the image being played and displayed at the time of the interaction operation. If the user performs a new operation, the new operation is the interaction operation, and the corresponding first virtual viewpoint may likewise be determined according to the viewing angle of the image being played and displayed at the time of that operation.

Then, starting from the first virtual viewpoint, the images corresponding to the virtual viewpoints may be displayed one after another in the preset order of virtual viewpoints. If the specified frame time corresponds to a single frame time, the obtained multi-angle free-view video data may include multi-angle free-view spatial data of images ordered by frame time, and a static multi-angle free-view image may be displayed; if the specified frame time corresponds to different frame times, the obtained multi-angle free-view video data may include multi-angle free-view spatial data and multi-angle free-view temporal data of frame images ordered by frame time, and a dynamic multi-angle free-view image may be displayed, that is, the frame images of multi-angle free-view video frames are displayed.
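The two decisions in this passage, which viewpoints to show and whether the result is static or dynamic, can be sketched as follows. The function names are hypothetical, viewpoints are represented as plain strings, and the sketch assumes the display runs from the first virtual viewpoint to the end of the preset order without wrapping around; the text does not say whether wrapping occurs.

```python
from typing import List, Sequence


def display_kind(frame_times: Sequence[float]) -> str:
    """'static' if the specified frame time is a single frame time, else 'dynamic'."""
    if not frame_times:
        raise ValueError("at least one frame time is required")
    return "static" if len(set(frame_times)) == 1 else "dynamic"


def viewpoint_sequence(preset_order: List[str], first_viewpoint: str) -> List[str]:
    """Viewpoints to display, starting at `first_viewpoint` in the preset order.

    Assumption: no wrap-around past the end of the preset order.
    """
    start = preset_order.index(first_viewpoint)  # raises ValueError if unknown
    return preset_order[start:]


order = ["left", "front", "right", "top"]
print(viewpoint_sequence(order, "front"))
print(display_kind([2.0]))
print(display_kind([2.0, 2.04]))
```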

In an embodiment of the present invention, refer to FIG. 45 and FIG. 46. The multi-angle free-view video data obtained by the interactive terminal may include multi-angle free-view spatial data and multi-angle free-view temporal data of frame images ordered by frame time. The user swipes horizontally to the right to produce an interaction operation, from which the corresponding first virtual viewpoint is determined. Because different virtual viewpoints may correspond to different multi-angle free-view spatial data and temporal data, the frame image displayed in the interaction interface changes in both time and space with the interaction operation, as shown in FIG. 46: the displayed content changes from the athlete running toward the finish line in FIG. 45 to the athlete about to cross the finish line in FIG. 46, and, taking the athlete as the target object, the displayed viewing angle changes from a left view to a front view.

Similarly, for FIG. 45 and FIG. 47, the displayed content changes from the athlete running toward the finish line in FIG. 45 to the athlete having crossed the finish line in FIG. 47, and, taking the athlete as the target object, the displayed viewing angle changes from a left view to a right view.

Similarly, for FIG. 45 and FIG. 48, the user swipes vertically upward to produce an interaction operation; the displayed content changes from the athlete running toward the finish line in FIG. 45 to the athlete having crossed the finish line in FIG. 48, and, taking the athlete as the target object, the displayed viewing angle changes from a left view to a top view.
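The spatial part of these examples, in which a horizontal swipe moves the view from left to front to right and an upward swipe moves it toward a top view, can be sketched as a mapping from swipe distance to viewing angles. Everything concrete here is an assumption made for illustration: the angle convention (azimuth -90 for the left view, 0 for the front view, +90 for the right view; elevation 90 for the top view), the sensitivity of 0.25 degrees per pixel, and the clamping to those ranges. The change in time that accompanies the viewpoint change is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewpoint:
    azimuth_deg: float    # assumed: -90 = left view, 0 = front view, +90 = right view
    elevation_deg: float  # assumed: 0 = eye level, 90 = top view


def apply_swipe(view: Viewpoint, dx_px: float, dy_px: float,
                deg_per_px: float = 0.25) -> Viewpoint:
    """Map a swipe to a new virtual viewpoint.

    A rightward swipe (dx_px > 0) increases azimuth; an upward swipe
    (dy_px > 0) increases elevation. Results are clamped to the assumed ranges.
    """
    azimuth = max(-90.0, min(90.0, view.azimuth_deg + dx_px * deg_per_px))
    elevation = max(0.0, min(90.0, view.elevation_deg + dy_px * deg_per_px))
    return Viewpoint(azimuth, elevation)


left_view = Viewpoint(-90.0, 0.0)
print(apply_swipe(left_view, 360, 0))   # toward the front view
print(apply_swipe(left_view, 720, 0))   # toward the right view
print(apply_swipe(left_view, 0, 360))   # toward the top view
```

Because the operation trajectory may be a curve, a real implementation would likely apply such a mapping incrementally along the trajectory rather than once per gesture.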

It can be understood that different interaction operations can be obtained from different user operations, and the corresponding first virtual viewpoint can be determined according to the viewing angle of the image displayed at the time of the interaction operation; depending on the obtained multi-angle free-view video data, a static or dynamic multi-angle free-view image can be displayed. The embodiments of the present invention impose no limitation in this respect.

In a specific implementation, the interaction data may further include at least one of the following: on-site analysis data, information data of an acquisition object, information data of equipment associated with an acquisition object, information data of items deployed on site, and information data of logos displayed on site.

In an embodiment of the present invention, FIG. 10 is a schematic diagram of an interactive interface of another interactive terminal. After an interaction identifier is triggered, the interactive terminal 100 may display multi-angle free-view images at the specified frame time corresponding to the triggered interaction identifier, and on-site analysis data may be superimposed on the image (not shown), as illustrated by the on-site analysis data 101 in FIG. 10.

In an embodiment of the present invention, FIG. 11 is a schematic diagram of an interactive interface of another interactive terminal. After the user triggers an interaction identifier, the interactive terminal 110 may display multi-angle free-view images at the specified frame time corresponding to the triggered interaction identifier, and information data of an acquisition object may be superimposed on the image (not shown), as illustrated by the acquisition object information data 111 in FIG. 11.

In an embodiment of the present invention, FIG. 12 is a schematic diagram of an interactive interface of another interactive terminal. After the user triggers an interaction identifier, the interactive terminal 120 may display multi-angle free-view images at the specified frame time corresponding to the triggered interaction identifier, and information data of acquisition objects may be superimposed on the image (not shown), as illustrated by the acquisition object information data 121-123 in FIG. 12.

In an embodiment of the present invention, FIG. 13 is a schematic diagram of an interactive interface of another interactive terminal. After the user triggers an interaction identifier, the interactive terminal 130 may display multi-angle free-view images at the specified frame time corresponding to the triggered interaction identifier, and information data of an item deployed on site may be superimposed on the image (not shown), as illustrated by the file package information data 131 in FIG. 13.

In an embodiment of the present invention, FIG. 14 is a schematic diagram of an interactive interface of another interactive terminal. After an interaction identifier is triggered, the interactive terminal 140 may display multi-angle free-view images at the specified frame time corresponding to the triggered interaction identifier, and information data of a logo displayed on site may be superimposed on the image (not shown), as illustrated by the logo information data 141 in FIG. 14.

In this way, the user can obtain more related interaction information through the interaction data and gain a deeper, more comprehensive, and more professional understanding of the content being watched, which can further enhance the user's interactive experience.

Referring to the schematic structural diagram of another interactive terminal shown in FIG. 39, the interactive terminal 390 may include a processor 391, a network component 392, a memory 393, and a display component 394, wherein:

The processor 391 is adapted to obtain a to-be-played data stream in real time through the network component 392 and, in response to a trigger operation on an interaction identifier, to obtain the interaction data at the specified frame time corresponding to the interaction identifier, wherein the to-be-played data stream includes video data and an interaction identifier, the interaction identifier is associated with a specified frame time of the to-be-played data stream, and the interaction data includes multi-angle free-view data;

The memory 393 is adapted to store the to-be-played data stream obtained in real time;

The display component 394 is adapted to play and display, in real time and based on the to-be-played data stream obtained in real time, the video and the interaction identifier of the to-be-played data stream, and to display, based on the interaction data, multi-angle free-view images at the specified frame time.

The interactive terminal 390 may obtain the interaction data at the specified frame time from a server that stores interaction data, or may obtain the multiple frame images corresponding to the specified frame time from a data processing device that stores frame images and then generate the corresponding interaction data.

To enable those skilled in the art to better understand and implement the embodiments of the present invention, the on-site processing scheme for multi-angle free-view video images is described in further detail below.

Referring to the flowchart of the data processing method shown in FIG. 15, an embodiment of the present invention may specifically include the following steps:

S151: when it is determined that the sum of the bit rates of the compressed video data streams to be transmitted by the acquisition devices in the acquisition array is not greater than a preset bandwidth threshold, send a stream pull instruction to each acquisition device in the acquisition array, wherein the acquisition devices in the acquisition array are placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range.

The multi-angle free view may refer to the spatial positions and viewing angles of virtual viewpoints that allow the scene to be switched freely. The multi-angle free-view range may be determined according to the needs of the application scenario.

In a specific implementation, the preset bandwidth threshold may be determined by the transmission capacity of the transmission network in which the acquisition devices of the acquisition array are located. For example, if the uplink bandwidth of the transmission network is 1000 Mbps, the preset bandwidth threshold may be 1000 Mbps.
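The condition checked in step S151 is a single comparison: the sum of the bit rates of the streams to be pulled must not exceed the preset bandwidth threshold. A minimal sketch follows; the function name `can_pull_streams` and the example figures of 40 acquisition devices at 15, 25, or 30 Mbps each are assumptions chosen only to illustrate the arithmetic against the 1000 Mbps threshold mentioned above.

```python
from typing import Sequence


def can_pull_streams(bitrates_mbps: Sequence[float], bandwidth_threshold_mbps: float) -> bool:
    """True if the summed bit rate is not greater than the bandwidth threshold."""
    return sum(bitrates_mbps) <= bandwidth_threshold_mbps


# 40 devices at 15 Mbps each: 600 Mbps in total, within a 1000 Mbps threshold.
print(can_pull_streams([15] * 40, 1000))
# 40 devices at 30 Mbps each: 1200 Mbps in total, above the threshold.
print(can_pull_streams([30] * 40, 1000))
```

Only when the check passes would stream pull instructions be sent to the acquisition devices; a total exactly equal to the threshold also passes, since the condition is "not greater than".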

S152: receive the compressed video data streams transmitted in real time by the acquisition devices in the acquisition array in response to the stream-pulling instruction, where each compressed video data stream is obtained by the respective acquisition device through real-time synchronized acquisition from its corresponding angle followed by data compression.

In a specific implementation, the acquisition device itself may provide encoding and encapsulation functions, so that the raw video data acquired synchronously in real time from the corresponding angle can be encoded and encapsulated. The encapsulation format used by the acquisition device may be any of AVI, QuickTime File Format, MPEG, WMV, Real Video, Flash Video, Matroska, or another encapsulation format; the encoding format may be H.261, H.263, H.264, H.265, MPEG, AVS, or another encoding format. The acquisition device may also provide a compression function. For the same amount of data before compression, a higher compression ratio yields a smaller amount of data after compression, which relieves the bandwidth pressure of real-time synchronized transmission. The acquisition device may therefore use techniques such as predictive coding, transform coding, and entropy coding to raise the video compression ratio.

With the above data processing method, whether the transmission bandwidth is sufficient is determined before the streams are pulled. This avoids data transmission congestion during stream pulling, allows the data acquired and compressed by each acquisition device to be transmitted synchronously in real time, and speeds up the processing of multi-angle free-view video data. Low-latency playback of multi-angle free-view video can thus be achieved with limited bandwidth and data processing resources, reducing implementation cost.

In a specific implementation, the parameter values of each acquisition device may be obtained and used to calculate whether the sum of the bit rates of the compressed video data streams to be transmitted by the acquisition devices in the acquisition array is not greater than the preset bandwidth threshold. For example, the acquisition array may contain 40 acquisition devices, each producing a compressed video data stream with a bit rate of 15 Mbps, so the total bit rate of the acquisition array is 15 × 40 = 600 Mbps. If the preset bandwidth threshold is 1000 Mbps, it is determined that the sum of the bit rates is not greater than the preset bandwidth threshold. Stream-pulling instructions may then be sent to each of the 40 acquisition devices according to their IP addresses.
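The check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example; the function name `can_pull_streams` and the use of Mbps as the unit are assumptions made here and do not come from the patent.

```python
from typing import Iterable


def can_pull_streams(bitrates_mbps: Iterable[float],
                     bandwidth_threshold_mbps: float) -> bool:
    """Return True when the summed bit rate does not exceed the threshold.

    Hypothetical helper: the patent only states the comparison, not an API.
    """
    return sum(bitrates_mbps) <= bandwidth_threshold_mbps


# The example from the text: 40 acquisition devices at 15 Mbps each.
bitrates = [15.0] * 40
total_mbps = sum(bitrates)                      # 15 * 40 = 600 Mbps
pull_allowed = can_pull_streams(bitrates, 1000.0)
```

With the values from the text, `total_mbps` is 600 and the check passes, so the stream-pulling instructions could be sent.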

In a specific implementation, to ensure that the parameter values of the acquisition devices in the acquisition array are uniform, so that the devices can acquire and compress data synchronously in real time, the parameter values of each acquisition device may be set before the stream-pulling instructions are sent. The parameters of an acquisition device may include acquisition parameters and compression parameters. They are set so that the sum of the bit rates of the compressed video data streams, which the acquisition devices obtain through real-time synchronized acquisition from their corresponding angles and data compression according to the set values, is not greater than the preset bandwidth threshold.

Acquisition parameters and compression parameters complement each other. With the compression parameter values unchanged, the acquisition parameter values can be set to reduce the amount of raw video data, which shortens the data compression time; with the acquisition parameter values unchanged, the compression parameter values can be set to reduce the amount of compressed data, which shortens the data transmission time. As another example, setting a higher compression ratio saves transmission bandwidth, and so does setting a lower sampling rate. The acquisition parameters and/or compression parameters may therefore be set according to the actual situation.

Thus, before stream pulling begins, the parameter values of the acquisition devices in the acquisition array can be set so that they are uniform, the devices can acquire and compress data synchronously in real time from their corresponding angles, and the sum of the bit rates of the resulting compressed video data streams is not greater than the preset bandwidth threshold. This avoids network congestion and allows low-latency playback of multi-angle free-view video even with limited bandwidth resources.

In a specific embodiment, the acquisition parameters may include a focal length parameter, an exposure parameter, a resolution parameter, an encoding bit rate parameter, an encoding format parameter, and the like, and the compression parameters may include a compression ratio parameter, a compression format parameter, and the like. By setting different parameter values, the values best suited to the transmission network to which the acquisition devices are connected can be obtained.

To simplify the setup process and save setup time, before the parameter values of the acquisition devices in the acquisition array are set, it may first be determined whether the sum of the bit rates of the compressed video data streams that the acquisition devices obtain by acquiring and compressing data with their already-set parameter values is greater than the preset bandwidth threshold. When the sum is greater than the preset bandwidth threshold, the parameter values of the acquisition devices may be set before the stream-pulling instructions are sent to them. It can be understood that, in a specific implementation, the values of the acquisition parameters and compression parameters may also be set according to imaging quality requirements, such as the resolution of the multi-angle free-view images to be displayed.
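The "check first, adjust only if needed" flow can be illustrated as follows. This is a minimal sketch under stated assumptions: `CaptureDevice` and `prepare_devices` are hypothetical names, and "setting the parameter values" is modeled here as assigning every device the same encoding bit rate, which is only one of many ways the parameters could be adjusted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaptureDevice:
    """Hypothetical stand-in for one acquisition device."""
    ip: str
    bitrate_mbps: float  # bit rate produced with the currently set parameters


def prepare_devices(devices: List[CaptureDevice],
                    bandwidth_threshold_mbps: float) -> List[str]:
    """Adjust bit rates only when the current sum exceeds the threshold.

    Returns the IP addresses to which stream-pulling instructions
    would then be sent.
    """
    if not devices:
        return []
    total = sum(d.bitrate_mbps for d in devices)
    if total > bandwidth_threshold_mbps:
        # Assumption: give every device an equal share of the threshold.
        uniform = bandwidth_threshold_mbps / len(devices)
        for d in devices:
            d.bitrate_mbps = uniform
    return [d.ip for d in devices]


array = [CaptureDevice(ip=f"192.168.0.{i}", bitrate_mbps=30.0)
         for i in range(1, 41)]
targets = prepare_devices(array, 1000.0)
```

In this example the initial sum is 1200 Mbps, which exceeds the 1000 Mbps threshold, so each device is reconfigured to 25 Mbps before any instruction is sent. Had the sum already been within the threshold, the parameters would have been left untouched, which is the time saving the text describes.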

In a specific implementation, the compressed video data streams obtained by the acquisition devices are transmitted and written in one continuous process. Therefore, before the stream-pulling instructions are sent to the acquisition devices in the acquisition array, it may also be determined whether the sum of the bit rates of the compressed video data streams to be transmitted by the acquisition devices is greater than a preset write speed threshold. When it is, the parameter values of the acquisition devices may be set so that the sum of the bit rates of the compressed video data streams, which the acquisition devices obtain through real-time synchronized acquisition from their corresponding angles and data compression according to the set values, is not greater than the preset write speed threshold.

In a specific implementation, the preset write speed threshold may be determined by the data write speed of the storage medium. For example, if the upper limit of the data write speed of the solid-state drive (SSD) of the data processing device is 100 Mbps, the preset write speed threshold may be 100 Mbps.

With the above scheme, it can be ensured before stream pulling begins that the sum of the bit rates of the compressed video data streams obtained by the acquisition devices through real-time synchronized acquisition from their corresponding angles and data compression is not greater than the preset write speed threshold. This avoids data write congestion and keeps the link clear while the compressed video data streams are acquired, transmitted, and written, so that the compressed video streams uploaded by the acquisition devices can be processed in real time, enabling playback of multi-angle free-view video.
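Since the summed bit rate must stay within both the bandwidth threshold and the write speed threshold, the two checks can be folded into one effective limit. The sketch below assumes this is done by taking the smaller of the two values; the patent describes the checks separately, and the helper names are hypothetical.

```python
def effective_limit_mbps(bandwidth_threshold_mbps: float,
                         write_speed_threshold_mbps: float) -> float:
    """The summed bit rate must satisfy both thresholds, so use the smaller."""
    return min(bandwidth_threshold_mbps, write_speed_threshold_mbps)


def per_device_budget_mbps(limit_mbps: float, device_count: int) -> float:
    """Equal share of the limit per device (an assumed allocation policy)."""
    if device_count <= 0:
        raise ValueError("device_count must be positive")
    return limit_mbps / device_count


# Example values taken from the text: 1000 Mbps uplink, 100 Mbps write speed,
# 40 acquisition devices currently producing 15 Mbps each.
limit = effective_limit_mbps(1000.0, 100.0)
budget = per_device_budget_mbps(limit, 40)
needs_adjustment = 15.0 * 40 > limit
```

With these example figures the write speed is the binding constraint: 600 Mbps exceeds the 100 Mbps limit, so the device parameters would have to be set to bring each stream down to at most 2.5 Mbps under an equal-share policy.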

In a specific implementation, the compressed video data streams obtained by the acquisition devices may be stored. When a video frame interception instruction is received, frame-level synchronized video frames may be intercepted from the compressed video data streams according to the instruction, and the intercepted video frames may be uploaded synchronously to a designated target end.

The designated target end may be a preset target end or the target end specified by the video frame interception instruction. The intercepted video frames may first be encapsulated and uploaded to the designated target end over a network transmission protocol, and then parsed to obtain the frame-level synchronized video frames of the corresponding compressed video data streams.

Thus, subsequent processing of the video frames intercepted from the compressed video data streams is handed over to the designated target end. This saves network transmission resources, reduces the pressure and difficulty of deploying a large number of servers on site, greatly reduces the data processing load, and shortens the transmission delay of multi-angle free-view video frames.
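The encapsulate, upload, and parse sequence can be illustrated with a simple length-prefixed container. This is a sketch only: the patent does not specify an encapsulation layout or a transport protocol, so the header fields (stream id, timestamp in microseconds, payload length) and the function names are assumptions.

```python
import struct
from typing import List, Tuple

# Assumed header: stream id (uint16), timestamp in microseconds (uint64),
# payload length (uint32), all in network byte order.
HEADER = struct.Struct("!HQI")

Frame = Tuple[int, int, bytes]


def pack_frames(frames: List[Frame]) -> bytes:
    """Encapsulate intercepted frames into a single byte string."""
    return b"".join(
        HEADER.pack(stream_id, timestamp_us, len(payload)) + payload
        for stream_id, timestamp_us, payload in frames
    )


def unpack_frames(blob: bytes) -> List[Frame]:
    """Parse the byte string back into (stream id, timestamp, payload) tuples."""
    frames: List[Frame] = []
    offset = 0
    while offset < len(blob):
        stream_id, timestamp_us, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        frames.append((stream_id, timestamp_us, blob[offset:offset + length]))
        offset += length
    return frames


intercepted = [(1, 40_000, b"frame-a1"), (2, 40_000, b"frame-a2")]
blob = pack_frames(intercepted)
restored = unpack_frames(blob)
```

The resulting byte string could be sent over whatever transport connects the data processing device to the designated target end; the receiver recovers each frame together with the stream it came from and its timestamp.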

In a specific implementation, to ensure that frame-level synchronized video frames are intercepted from the compressed video data streams, the following steps may be included, as shown in FIG. 16:

S161: determine one of the compressed video data streams received in real time from the acquisition devices in the acquisition array as the reference data stream;

S162: based on the received video frame interception instruction, determine the video frame to be intercepted in the reference data stream, and select, in each of the remaining compressed video data streams, the video frame synchronized with the video frame to be intercepted in the reference data stream as the video frame to be intercepted in that stream;

S163: intercept the video frames to be intercepted in each compressed video data stream.

To enable those skilled in the art to better understand and implement the embodiments of the present invention, specific application scenarios are used below to describe in detail how the video frames to be intercepted in each compressed video data stream are determined.

In one embodiment of the present invention, the acquisition array may contain 40 acquisition devices, so 40 compressed video data streams can be received in real time. Suppose that, among the compressed video data streams received in real time from the acquisition devices in the acquisition array, the compressed video data stream A1 corresponding to acquisition device A1' is determined as the reference data stream. Then, based on the feature information X of an object in the video frame to be intercepted, as indicated in the received video frame interception instruction, the video frame a1 in the reference data stream that is consistent with the feature information X is determined as the video frame to be intercepted. Next, according to the feature information x1 of the object in video frame a1, the video frames a2-a40 in the remaining compressed video data streams A2-A40 that are consistent with the feature information x1 are selected as the video frames to be intercepted in those streams.

The feature information of the object may include at least one of shape feature information, color feature information, position feature information, and the like. The feature information X indicated in the video frame interception instruction and the feature information x1 of the object in video frame a1 of the reference data stream may be the same representation of the feature information of the same object; for example, X and x1 may both be two-dimensional feature information. They may also be different representations of the feature information of the same object; for example, X may be two-dimensional feature information while x1 is three-dimensional feature information. In addition, a similarity threshold may be preset: when the similarity threshold is met, the feature information X may be considered consistent with x1, or the feature information x1 may be considered consistent with the feature information x2-x40 of the object in the remaining compressed video data streams A2-A40.

The specific representation of the object's feature information and the similarity threshold may be determined according to the preset multi-angle free-view range and the on-site scene; the embodiments of the present invention impose no limitation on them.
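As an illustration of matching against a similarity threshold, the sketch below compares feature vectors with cosine similarity and picks the best-matching frame in a stream. Both the vector representation of the feature information and the choice of cosine similarity are assumptions made for this example; the patent leaves the representation and the threshold open.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_matching_frame(target: Sequence[float],
                        frame_features: Sequence[Sequence[float]],
                        similarity_threshold: float) -> Optional[int]:
    """Index of the frame most similar to `target`, or None if no frame
    reaches the similarity threshold."""
    best_index: Optional[int] = None
    best_score = similarity_threshold
    for index, features in enumerate(frame_features):
        score = cosine_similarity(target, features)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


# Hypothetical feature vectors for three frames of one stream.
x1 = [1.0, 0.0]
stream_features = [[0.0, 1.0], [1.0, 0.1], [1.0, 0.0]]
match = find_matching_frame(x1, stream_features, similarity_threshold=0.9)
```

Applied per stream, the same routine would first locate a1 in the reference stream from X and then locate a2-a40 in the remaining streams from x1.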

In another embodiment of the present invention, the acquisition array may contain 40 acquisition devices, so 40 compressed video data streams can be received in real time. Suppose that, among the compressed video data streams received in real time from the acquisition devices in the acquisition array, the compressed video data stream B1 corresponding to acquisition device B1' is determined as the reference data stream. Then, based on the timestamp information Y of the video frame to be intercepted, as indicated in the received video frame interception instruction, the video frame b1 in the reference data stream that corresponds to the timestamp information Y is determined as the video frame to be intercepted. Next, according to the timestamp information y1 of video frame b1, the video frames b2-b40 in the remaining compressed video data streams B2-B40 whose timestamp information is consistent with y1 are selected as the video frames to be intercepted in those streams.

The timestamp information Y indicated in the video frame interception instruction may differ by a certain error from the timestamp information y1 of the video frame b1 to be intercepted in the reference data stream. For example, if none of the timestamps of the video frames in the reference data stream equals the timestamp information Y and there is an error of 0.1 ms, an error range may be preset. If the error range is ±1 ms, the 0.1 ms error falls within it, so the video frame b1 whose timestamp information y1 differs from Y by 0.1 ms may be selected as the video frame to be intercepted in the reference data stream. The specific error range and the rule for selecting the timestamp information y1 in the reference data stream may be determined according to the on-site acquisition devices and transmission network, and are not limited in this embodiment.
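The timestamp-based selection with an error range can be sketched as follows. The nearest-timestamp rule, the function names, and the representation of a stream as a list of frame timestamps in milliseconds are assumptions for illustration; the patent leaves the selection rule to the implementer.

```python
from typing import Dict, Optional, Sequence


def nearest_frame_index(timestamps_ms: Sequence[float],
                        target_ms: float,
                        tolerance_ms: float) -> Optional[int]:
    """Index of the frame closest to `target_ms`, or None when even the
    closest frame lies outside the error range."""
    if not timestamps_ms:
        return None
    index = min(range(len(timestamps_ms)),
                key=lambda i: abs(timestamps_ms[i] - target_ms))
    if abs(timestamps_ms[index] - target_ms) <= tolerance_ms:
        return index
    return None


def select_synchronized_frames(streams: Dict[str, Sequence[float]],
                               reference_id: str,
                               target_ms: float,
                               tolerance_ms: float) -> Dict[str, Optional[int]]:
    """Pick b1 in the reference stream from Y, then match the other
    streams against b1's own timestamp y1."""
    ref_index = nearest_frame_index(streams[reference_id], target_ms, tolerance_ms)
    if ref_index is None:
        return {}
    y1 = streams[reference_id][ref_index]
    selected: Dict[str, Optional[int]] = {reference_id: ref_index}
    for stream_id, timestamps in streams.items():
        if stream_id != reference_id:
            selected[stream_id] = nearest_frame_index(timestamps, y1, tolerance_ms)
    return selected


# Two hypothetical streams; Y = 40.0 ms, error range of +/- 1 ms.
streams = {"B1": [0.0, 40.1, 80.0], "B2": [0.2, 40.0, 80.1]}
selection = select_synchronized_frames(streams, "B1", 40.0, 1.0)
```

Here the reference stream has no frame at exactly 40.0 ms, but the frame at 40.1 ms is within the error range and is chosen as b1; the second stream is then matched against 40.1 ms rather than against the originally requested timestamp.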

It can be understood that the methods described in the above embodiments for determining the video frames to be intercepted in each compressed video data stream may be used separately or together; the embodiments of the present invention impose no limitation on this.

With the above data processing method, the data processing device can smoothly pull the data acquired and compressed by each acquisition device.

The technical solution for data processing by the acquisition array in the embodiments of this specification is described clearly and completely below with reference to the accompanying drawings of those embodiments.

Referring to the flowchart of the data processing method shown in FIG. 17, an embodiment of the present invention may specifically include the following steps:

S171: the acquisition devices in the acquisition array, which are placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range, each acquire raw video data synchronously in real time from their corresponding angles and compress the acquired raw video data in real time to obtain corresponding compressed video data streams.

S172: the data processing device linked to the acquisition array, upon determining that the sum of the bit rates of the compressed video data streams to be transmitted by the acquisition devices in the acquisition array is not greater than a preset bandwidth threshold, sends a stream-pulling instruction to each acquisition device in the acquisition array.

In a specific implementation, the preset bandwidth threshold may be determined by the transmission capacity of the transmission network to which the acquisition devices in the acquisition array are connected. For example, if the uplink bandwidth of the transmission network is 1000 Mbps, the preset bandwidth threshold may be 1000 Mbps.

S173: each acquisition device in the acquisition array, in response to the stream-pulling instruction, transmits its compressed video data stream to the data processing device in real time.

In a specific implementation, the data processing device may be deployed according to the actual scenario. For example, when suitable space is available on site, the data processing device may be placed in a non-acquisition area of the site as an on-site server; when no suitable space is available on site, the data processing device may be placed in the cloud as a cloud server.

With the above scheme, the data processing device linked to the acquisition array sends the stream-pulling instructions to the acquisition devices only after determining that the sum of the bit rates of the compressed video data streams to be transmitted is not greater than the preset bandwidth threshold. The data acquired and compressed by each acquisition device can therefore be transmitted synchronously in real time, the streams can be pulled in real time over the existing transmission network, and data transmission congestion during stream pulling is avoided. Each acquisition device in the acquisition array then transmits its compressed video data stream to the data processing device in real time in response to the stream-pulling instruction. Because the data transmitted by each acquisition device is compressed, the bandwidth pressure of real-time synchronized transmission is relieved and the processing of multi-angle free-view video data is accelerated.

This makes it unnecessary to deploy a large number of servers on site for data processing, or to aggregate the acquired raw data through SDI capture cards and then process it on computing servers in an on-site machine room. Expensive SDI video transmission cables and SDI interfaces can be avoided, and data is instead transmitted over an ordinary transmission network. Low-latency playback of multi-angle free-view video can thus be achieved with limited bandwidth and data processing resources, reducing implementation cost.

In a specific implementation, to simplify the setup process and save setup time, before the parameter values of the acquisition devices in the acquisition array are set, the data processing device may first determine whether the sum of the bit rates of the compressed video data streams that the acquisition devices obtain by acquiring and compressing data with their already-set parameter values is greater than the preset bandwidth threshold. When the sum is greater than the preset bandwidth threshold, the data processing device may set the parameter values of the acquisition devices and then send the stream-pulling instructions to them.

In a specific implementation, the compressed video data streams obtained by the acquisition devices are transmitted and written in one continuous process, so it must also be ensured that the data processing device can write the compressed video data streams without congestion. Therefore, before sending the stream-pulling instructions to the acquisition devices in the acquisition array, the data processing device may also determine whether the sum of the bit rates of the compressed video data streams to be transmitted by the acquisition devices is greater than a preset write speed threshold. When it is, the data processing device may set the parameter values of the acquisition devices so that the sum of the bit rates of the compressed video data streams, which the acquisition devices obtain through real-time synchronized acquisition from their corresponding angles and data compression according to the set values, is not greater than the preset write speed threshold.

In a specific implementation, the preset write speed threshold may be determined by the data write speed of the data processing device.

In a specific implementation, data may be transmitted between the acquisition devices in the acquisition array and the data processing device in at least one of the following ways:

1. Data transmission through a switch;

The acquisition devices in the acquisition array are connected to the data processing device through a switch. The switch can aggregate the compressed video data streams of more acquisition devices and transmit them together to the data processing device, which reduces the number of ports the data processing device must support. For example, if the switch supports 40 inputs, the data processing device can receive through the switch the video streams of an acquisition array of up to 40 acquisition devices simultaneously, which in turn reduces the number of data processing devices needed.

2. Data transmission through a local area network.

The acquisition devices in the acquisition array are connected to the data processing device through a local area network. The local area network can transmit the compressed video data streams of the acquisition devices to the data processing device in real time, which reduces the number of ports the data processing device must support and in turn reduces the number of data processing devices needed.

In a specific implementation, the data processing device may store (for example, cache) the compressed video data streams obtained by the acquisition devices. Upon receiving a video frame interception instruction, the data processing device may intercept frame-level synchronized video frames from the compressed video data streams according to the instruction and upload the intercepted video frames synchronously to the designated target end.

The data processing device may establish a connection with a target end in advance through a port or an IP address, or it may upload the intercepted video frames synchronously to the port or IP address specified by the video frame interception instruction. The data processing device may first encapsulate the intercepted video frames and upload them to the designated target end over a network transmission protocol; the frames are then parsed to obtain the frame-level synchronized video frames of the corresponding compressed video data streams.

With the above scheme, the compressed video data streams obtained by the acquisition devices in the acquisition array through real-time synchronized acquisition and data compression can be transmitted together to the data processing device. After receiving a video frame interception instruction, the data processing device performs the preliminary processing of marking and frame interception, uploads the intercepted frame-level synchronized video frames of the compressed video data streams synchronously to the designated target end, and hands over subsequent processing of those frames to the designated target end. This saves network transmission resources, reduces the pressure and difficulty of on-site deployment, greatly reduces the data processing load, and shortens the transmission delay of multi-angle free-view video frames.

In a specific implementation, to intercept the frame-level synchronized video frames in each compressed video data stream, the data processing device may first select one of the compressed video data streams received in real time from the acquisition devices in the acquisition array as the reference data stream. Then, based on the received video frame interception instruction, the data processing device may determine the video frame to be intercepted in the reference data stream and select, in each of the remaining compressed video data streams, the video frame synchronized with it as the video frame to be intercepted in that stream. Finally, the data processing device intercepts the video frames to be intercepted in each compressed video data stream. For the specific frame interception method, refer to the examples in the foregoing embodiments; details are not repeated here.

An embodiment of the present invention further provides a data processing device corresponding to the data processing method in the above embodiments. To enable those skilled in the art to better understand and implement the embodiments of the present invention, specific embodiments are described in detail below with reference to the accompanying drawings.

Referring to the schematic structural diagram of the data processing device shown in FIG. 18, in an embodiment of the present invention the data processing device 180 may include:

A first transmission matching unit 181, adapted to determine whether the sum of the bit rates of the compressed video data streams that the acquisition devices in the acquisition array intend to transmit is not greater than a preset bandwidth threshold, wherein the acquisition devices in the acquisition array are placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range.

An instruction sending unit 182, adapted to send a pull-stream instruction to each acquisition device in the acquisition array when it is determined that the sum of the bit rates of the compressed video data streams that the acquisition devices intend to transmit is not greater than the preset bandwidth threshold.

A data stream receiving unit 183, adapted to receive the compressed video data streams transmitted in real time by the acquisition devices in the acquisition array based on the pull-stream instructions, the compressed video data streams being obtained by the acquisition devices through real-time synchronous acquisition from their respective angles and data compression.

With the above data processing device, whether the transmission bandwidth matches is determined before the pull-stream instructions are sent to the acquisition devices in the acquisition array. This avoids data transmission congestion during streaming, allows the data obtained by each acquisition device through acquisition and data compression to be transmitted synchronously in real time, and speeds up the processing of multi-angle free-view video data, so that multi-angle free-view video can be realized with limited bandwidth and data processing resources and at reduced implementation cost.
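The bandwidth check performed by units 181 and 182 can be illustrated with a minimal Python sketch. The class name `AcquisitionDevice`, the field `bitrate_mbps`, and the function names are hypothetical and are not defined in this disclosure; the sketch only shows the comparison of the summed bit rate against the preset bandwidth threshold before any pull-stream instruction is sent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AcquisitionDevice:
    """Hypothetical record for one acquisition device in the array."""
    name: str
    bitrate_mbps: float  # bit rate of the compressed stream it intends to transmit


def bandwidth_matches(devices: List[AcquisitionDevice],
                      bandwidth_threshold_mbps: float) -> bool:
    """Return True when the summed bit rate is not greater than the threshold."""
    return sum(d.bitrate_mbps for d in devices) <= bandwidth_threshold_mbps


def devices_to_pull(devices: List[AcquisitionDevice],
                    bandwidth_threshold_mbps: float) -> List[str]:
    """Return the devices that would be sent a pull-stream instruction.

    An empty list means the bandwidth check failed and nothing is sent.
    """
    if not bandwidth_matches(devices, bandwidth_threshold_mbps):
        return []
    return [d.name for d in devices]
```

In this sketch a failed check simply sends nothing; the embodiments that follow describe setting device parameters so that the check can succeed.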

In an embodiment of the present invention, as shown in FIG. 18, the data processing device 180 may further include:

A first parameter setting unit 184, adapted to set the parameter values of each acquisition device in the acquisition array before the pull-stream instructions are sent to the acquisition devices;

wherein the parameters of an acquisition device may include acquisition parameters and compression parameters, and the sum of the bit rates of the compressed video data streams that the acquisition devices obtain, through real-time synchronous acquisition from their respective angles and data compression according to the set parameter values, is not greater than the preset bandwidth threshold.

In an embodiment of the present invention, to simplify the setup process and save setup time, as shown in FIG. 18, the data processing device 180 may further include:

A second transmission matching unit 185, adapted to determine, before the parameter values of the acquisition devices in the acquisition array are set, whether the sum of the bit rates of the compressed video data streams that the acquisition devices obtain through acquisition and data compression according to their already-set parameter values is not greater than the preset bandwidth threshold.

A write matching unit 186, adapted to determine whether the sum of the bit rates of the compressed video data streams that the acquisition devices in the acquisition array intend to transmit is greater than a preset write speed threshold;

A second parameter setting unit 187, adapted to set the parameter values of the acquisition devices in the acquisition array when the sum of the bit rates of the compressed video data streams that the acquisition devices intend to transmit is greater than the preset write speed threshold, so that the sum of the bit rates of the compressed video data streams that the acquisition devices obtain, through real-time synchronous acquisition from their respective angles and data compression according to the set parameter values, is not greater than the preset write speed threshold.

Therefore, before streaming starts, it can be ensured that the sum of the bit rates of the compressed video data streams obtained by the acquisition devices through real-time synchronous acquisition from their respective angles and data compression is not greater than the preset write speed threshold. This avoids data write congestion and keeps the link clear throughout the acquisition, transmission, and writing of the compressed video data streams, so that the compressed video streams uploaded by the acquisition devices can be processed in real time and multi-angle free-view video can be played.
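The write-speed check of units 186 and 187 can be sketched as follows. The disclosure does not state how the parameter values are chosen, so the uniform scaling used here is only one assumed policy, and the function name `fit_to_write_speed` is hypothetical.

```python
from typing import List


def fit_to_write_speed(bitrates_mbps: List[float],
                       write_threshold_mbps: float) -> List[float]:
    """Return per-device target bit rates whose sum fits the write threshold.

    If the summed bit rate already fits, the rates are returned unchanged.
    Otherwise every rate is scaled by the same factor (an assumed policy;
    a real system would map the target rate back to acquisition and
    compression parameters such as resolution or quantization).
    """
    total = sum(bitrates_mbps)
    if total <= write_threshold_mbps:
        return list(bitrates_mbps)
    scale = write_threshold_mbps / total
    return [rate * scale for rate in bitrates_mbps]
```

For example, two devices at 20 Mbps each against a 20 Mbps write threshold would each be given a 10 Mbps target under this policy.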

A frame interception processing unit 188, adapted to intercept the frame-level synchronized video frames in each compressed video data stream according to a received video frame interception instruction;

An uploading unit 189, adapted to synchronously upload the intercepted video frames to the designated target end.

The designated target end may be a preset target end or a target end specified by the video frame interception instruction.

The subsequent processing of the video frames intercepted from the compressed video data streams is thus handed over to the designated target end, which saves network transmission resources, reduces the pressure and difficulty of on-site deployment, greatly reduces the data processing load, and shortens the transmission delay of multi-angle free-view video frames.

In an embodiment of the present invention, as shown in FIG. 18, the frame interception processing unit 188 may include:

A reference data stream selection subunit 1881, adapted to select one of the compressed video data streams received in real time from the acquisition devices in the acquisition array as the reference data stream;

A video frame selection subunit 1882, adapted to determine, based on the received video frame interception instruction, the video frame to be intercepted in the reference data stream, and to select, in each of the remaining compressed video data streams, the video frame synchronized with the video frame to be intercepted in the reference data stream as the video frame to be intercepted in that stream;

A video frame interception subunit 1883, adapted to intercept the video frames to be intercepted in each compressed video data stream.

In an embodiment of the present invention, as shown in FIG. 18, the video frame selection subunit 1882 may include at least one of the following:

A first video frame selection module 18821, adapted to select, according to the feature information of an object in the video frame to be intercepted in the reference data stream, the video frames in the remaining compressed video data streams whose object feature information is consistent with it, as the video frames to be intercepted in the remaining compressed video data streams;

A second video frame selection module 18822, adapted to select, according to the timestamp information of the video frame to be intercepted in the reference data stream, the video frames in the remaining compressed video data streams whose timestamp information is consistent with it, as the video frames to be intercepted in the remaining compressed video data streams.
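The timestamp-based selection performed by subunits 1881 to 1883 and module 18822 can be sketched in Python as below. The data layout (each stream as a list of `(timestamp, payload)` pairs) and the function name are assumptions made for illustration; the disclosure does not prescribe a data structure.

```python
from typing import Dict, List, Optional, Tuple

# (timestamp, compressed frame payload); this layout is an assumption
Frame = Tuple[int, bytes]


def intercept_synchronized_frames(
    streams: Dict[str, List[Frame]],
    reference_id: str,
    reference_index: int,
) -> Optional[Dict[str, bytes]]:
    """Intercept one frame per stream, synchronized to the reference stream.

    The frame to intercept is fixed in the reference stream first; every
    other stream contributes the frame whose timestamp is identical.
    Returns None when some stream has no frame with that timestamp.
    """
    target_timestamp, _ = streams[reference_id][reference_index]

    intercepted: Dict[str, bytes] = {}
    for stream_id, frames in streams.items():
        by_timestamp = dict(frames)
        if target_timestamp not in by_timestamp:
            return None
        intercepted[stream_id] = by_timestamp[target_timestamp]
    return intercepted
```

Returning `None` for an incomplete set is a design choice of this sketch: it prevents a partially synchronized group of frames from being uploaded to the designated target end.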

An embodiment of the present invention further provides a data processing system corresponding to the above data processing method, which uses the above data processing device to receive multiple compressed video data streams in real time. To enable those skilled in the art to better understand and implement the embodiments of the present invention, specific embodiments are described in detail below with reference to the accompanying drawings.

Referring to the schematic structural diagram of the data processing system shown in FIG. 19, in an embodiment of the present invention the data processing system 190 may include an acquisition array 191 and a data processing device 192. The acquisition array 191 includes a plurality of acquisition devices placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range, wherein:

Each acquisition device in the acquisition array 191 is adapted to synchronously acquire original video data in real time from its respective angle, to compress the acquired original video data in real time so as to obtain a compressed video data stream acquired synchronously in real time from that angle, and to transmit the obtained compressed video data stream to the data processing device 192 in real time based on the pull-stream instruction sent by the data processing device 192;

The data processing device 192 is adapted to send a pull-stream instruction to each acquisition device in the acquisition array 191 when it is determined that the sum of the bit rates of the compressed video data streams that the acquisition devices intend to transmit is not greater than a preset bandwidth threshold, and to receive the compressed video data streams transmitted in real time by the acquisition devices in the acquisition array 191.

With the above scheme, there is no need to deploy a large number of servers on site for data processing, nor to aggregate the acquired raw data through SDI capture cards and then process it on computing servers in an on-site machine room. Expensive SDI video transmission cables and SDI interfaces can be avoided; data transmission and streaming are instead carried out over an ordinary transmission network. Low-latency playback of multi-angle free-view video can thus be achieved with limited bandwidth and data processing resources, at reduced implementation cost.

In an embodiment of the present invention, the data processing device 192 is further adapted to set the parameter values of each acquisition device in the acquisition array 191 before sending the pull-stream instructions to the acquisition devices;

wherein the parameters of an acquisition device include acquisition parameters and compression parameters, and the sum of the bit rates of the compressed video data streams that the acquisition devices obtain, through real-time synchronous acquisition from their respective angles and data compression according to the set parameter values, is not greater than the preset bandwidth threshold.

Thus, before streaming starts, the data processing device can set the parameter values of the acquisition devices in the acquisition array so that the values are uniform across the array, each acquisition device can acquire synchronously in real time from its respective angle and compress the data, and the sum of the bit rates of the resulting compressed video data streams is not greater than the preset bandwidth threshold. Network congestion is thereby avoided, and low-latency playback of multi-angle free-view video can be achieved even with limited bandwidth resources.

In an embodiment of the present invention, before sending the pull-stream instructions to the acquisition devices in the acquisition array 191, the data processing device 192 determines whether the sum of the bit rates of the compressed video data streams that the acquisition devices in the acquisition array 191 intend to transmit is greater than a preset write speed threshold. When that sum is greater than the preset write speed threshold, the data processing device 192 sets the parameter values of the acquisition devices in the acquisition array 191, so that the sum of the bit rates of the compressed video data streams that the acquisition devices in the acquisition array 191 obtain, through real-time synchronous acquisition from their respective angles and data compression according to the set parameter values, is not greater than the preset write speed threshold.

Therefore, before streaming starts, it can be ensured that the sum of the bit rates of the compressed video data streams obtained by the acquisition devices through real-time synchronous acquisition from their respective angles and data compression is not greater than the preset write speed threshold. This avoids data write congestion at the data processing device and keeps the link clear throughout the acquisition, transmission, and writing of the compressed video data streams, so that the compressed video streams uploaded by the acquisition devices can be processed in real time and multi-angle free-view video can be played.

In a specific implementation, the acquisition devices in the acquisition array and the data processing device are adapted to be connected through a switch and/or a local area network.

In an embodiment of the present invention, the data processing system 190 may further include a designated target end 193.

The data processing device 192 is adapted to intercept the frame-level synchronized video frames in each compressed video data stream according to a received video frame interception instruction and to synchronously upload the intercepted video frames to the designated target end 193;

The designated target end 193 is adapted to receive the video frames intercepted by the data processing device 192 based on the video frame interception instruction.

The data processing device may establish a connection with a target end in advance through a port or an IP address, or it may synchronously upload the intercepted video frames to the port or IP address specified by the video frame interception instruction.

With the above scheme, the compressed video data streams obtained by the acquisition devices in the acquisition array through real-time synchronous acquisition and data compression can be transmitted uniformly to the data processing device. After receiving a video frame interception instruction, the data processing device performs the preliminary processing of marking and frame interception, synchronously uploads the intercepted frame-level synchronized video frames of each compressed video data stream to the designated target end, and leaves the subsequent processing of those frames to the designated target end. This saves network transmission resources, reduces the pressure and difficulty of on-site deployment, greatly reduces the data processing load, and shortens the transmission delay of multi-angle free-view video frames.

In an embodiment of the present invention, the data processing device 192 is adapted to select one of the compressed video data streams received in real time from the acquisition devices in the acquisition array 191 as the reference data stream; to determine, based on the received video frame interception instruction, the video frame to be intercepted in the reference data stream; to select, in each of the remaining compressed video data streams, the video frame synchronized with the video frame to be intercepted in the reference data stream as the video frame to be intercepted in that stream; and finally to intercept the video frames to be intercepted in each compressed video data stream.

To enable those skilled in the art to better understand and implement the embodiments of the present invention, the frame synchronization scheme between the data processing device and the acquisition devices is described in detail below through specific embodiments.

Referring to the flowchart of the data synchronization method shown in FIG. 20, an embodiment of the present invention may specifically include the following steps:

S201: send a pull-stream instruction to each acquisition device in the acquisition array, wherein the acquisition devices in the acquisition array are placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range, and each acquisition device synchronously acquires a video data stream in real time from its respective angle.

In a specific implementation, pull-stream synchronization can be achieved in several ways. For example, pull-stream instructions may be issued to all acquisition devices in the acquisition array at the same time. Alternatively, a pull-stream instruction may be sent only to the master acquisition device in the acquisition array to trigger its streaming, after which the master acquisition device synchronizes the pull-stream instruction to all slave acquisition devices and triggers them to stream.

S202: receive in real time the video data streams respectively transmitted by the acquisition devices in the acquisition array based on the pull-stream instruction, and determine whether the video data streams respectively transmitted by the acquisition devices are frame-level synchronized with one another.

In a specific implementation, the acquisition device itself may have encoding and encapsulation functions, so that it can encode and encapsulate the original video data acquired synchronously in real time from its angle. Each acquisition device may also have a compression function. For the same amount of data before compression, a higher compression ratio yields a smaller amount of data after compression, which relieves the bandwidth pressure of real-time synchronous transmission. The acquisition devices may therefore use techniques such as predictive coding, transform coding, and entropy coding to raise the video compression ratio.

S203: when the video data streams respectively transmitted by the acquisition devices in the acquisition array are not frame-level synchronized, send the pull-stream instruction to each acquisition device in the acquisition array again, until the video data streams respectively transmitted by the acquisition devices are frame-level synchronized.

With the above data synchronization method, determining whether the video data streams respectively transmitted by the acquisition devices in the acquisition array are frame-level synchronized ensures that the multiple data streams are transmitted synchronously. This avoids transmission problems such as missing or extra frames, increases the data processing speed, and meets the requirement for low-latency playback of multi-angle free-view video.
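Steps S201 to S203 amount to a send, check, and resend loop, which can be sketched as follows. The two callables and the `max_attempts` bound are assumptions of this sketch; the method itself repeats until synchronization is reached and does not specify a retry limit. The synchronization test shown here is the timestamp comparison of the first received frames.

```python
from typing import Callable, Sequence


def pull_until_synchronized(
    send_pull_instructions: Callable[[], None],
    receive_first_timestamps: Callable[[], Sequence[int]],
    max_attempts: int = 5,
) -> int:
    """Resend pull-stream instructions until all streams are frame-level synchronized.

    send_pull_instructions: hypothetical hook that sends the instruction
        to every acquisition device (S201).
    receive_first_timestamps: hypothetical hook that returns, per stream,
        the timestamp of the first frame received after the instruction (S202).
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        send_pull_instructions()
        timestamps = receive_first_timestamps()
        if len(set(timestamps)) == 1:  # every stream starts at the same frame
            return attempt
        # S203: not synchronized, so loop and send the instructions again
    raise RuntimeError("streams did not reach frame-level synchronization")
```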

In a specific implementation, when the acquisition devices in the acquisition array are started manually, there is a start-time error, and the devices may not begin acquiring their video data streams at the same moment. At least one of the following approaches may therefore be used to ensure that the acquisition devices in the acquisition array synchronously acquire video data streams in real time from their respective angles:

1. When at least one acquisition device obtains an acquisition start instruction, that acquisition device synchronizes the acquisition start instruction to the other acquisition devices, so that each acquisition device in the acquisition array starts synchronously acquiring its video data stream in real time from its respective angle based on the acquisition start instruction.

For example, the acquisition array may contain 40 acquisition devices. When acquisition device A1 obtains the acquisition start instruction, it synchronously sends the instruction to the other acquisition devices A2 to A40. After all acquisition devices have received the acquisition start instruction, each of them starts synchronously acquiring its video data stream in real time from its respective angle based on that instruction. Because data transmission between the acquisition devices is far faster than manual start-up, the start-time error caused by manual start-up can be reduced.

2. Each acquisition device in the acquisition array synchronously acquires its video data stream in real time from its respective angle based on a preset clock synchronization signal.

For example, a clock signal synchronization apparatus may be provided, and each acquisition device may be connected to it. On receiving a trigger signal (such as a synchronous acquisition start instruction), the clock signal synchronization apparatus may transmit a clock synchronization signal to each acquisition device, and each acquisition device starts synchronously acquiring its video data stream in real time from its respective angle based on the clock synchronization signal. Because the clock signal synchronization apparatus transmits the clock synchronization signal to the acquisition devices based on a preset trigger signal, the acquisition devices can acquire synchronously and are less susceptible to interference from external conditions and manual operation, which improves both the synchronization accuracy and the synchronization efficiency of the acquisition devices.

In a specific implementation, because of the network transmission environment, the acquisition devices in the acquisition array may not receive the pull-stream instruction at the same moment; there may be a time difference of a few milliseconds or less between them, which leaves the video data streams transmitted in real time by the acquisition devices out of synchronization. As shown in FIG. 21, the acquisition array includes acquisition devices 1 and 2, whose acquisition parameters are set identically: both use an acquisition frame rate of X fps, and the video frames they acquire are acquired in frame-level synchronization.

The acquisition interval T of each frame in acquisition devices 1 and 2 is T = 1/X seconds. Suppose the data processing device sends pull-stream instruction r at time t0, acquisition device 1 receives it at time t1, and acquisition device 2 receives it at time t2. If acquisition devices 1 and 2 both receive the instruction within the same acquisition interval T, they can be regarded as having received the pull-stream instruction at the same moment, and they can each transmit frame-level synchronized video data streams. If they do not receive it within the same acquisition interval, they are regarded as not having received the pull-stream instruction at the same moment, and frame-level synchronized transmission of their video data streams cannot be achieved. Frame-level synchronization of video data stream transmission may also be called pull-stream synchronization. Once achieved, pull-stream synchronization persists automatically until streaming stops.
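The same-interval condition can be made concrete with a short sketch. It assumes that the acquisition interval is the reciprocal of the frame rate and that both devices share frame boundaries at integer multiples of that interval measured from a common origin; the function names and the millisecond unit are illustrative choices, not part of the disclosure.

```python
def acquisition_interval_ms(frame_rate_fps: float) -> float:
    """Acquisition interval T in milliseconds for a frame rate of X fps."""
    return 1000.0 / frame_rate_fps


def received_in_same_interval(t1_ms: float, t2_ms: float,
                              frame_rate_fps: float) -> bool:
    """Check whether receipt times t1 and t2 fall in the same acquisition interval.

    Assumes frame boundaries lie at integer multiples of T from a shared
    time origin, so the interval index is floor(t / T).
    """
    interval = acquisition_interval_ms(frame_rate_fps)
    return int(t1_ms // interval) == int(t2_ms // interval)
```

At 25 fps the interval is 40 ms, so receipt times of 41 ms and 79 ms fall in the same interval, whereas 39 ms and 41 ms are only 2 ms apart but straddle a frame boundary and would yield streams offset by one frame.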

Frame-level synchronized video data streams may fail to be transmitted for the following reasons:

1) the pull-stream instruction has to be sent to each acquisition device separately;

2) the local area network introduces a delay when transmitting the pull-stream instruction.

Therefore, at least one of the following approaches may be used to determine whether the video data streams respectively transmitted by the acquisition devices in the acquisition array are frame-level synchronized:

1. When the Nth frame of the video data stream transmitted by each acquisition device in the acquisition array is obtained, the feature information of an object in the Nth frame of each video data stream is matched. When the feature information of the object in the Nth frames of the video data streams satisfies a preset similarity threshold, the feature information is determined to be consistent across the Nth frames of the video data streams transmitted by the acquisition devices, and the video data streams are therefore determined to be frame-level synchronized.

其中，N为不小于1的整数，各视频数据流的第N帧的对象的特征信息可以包括形状特征信息、颜色特征信息和位置特征信息等其中至少一种。Wherein, N is an integer not less than 1, and the feature information of the object in the Nth frame of each video data stream may include at least one of shape feature information, color feature information, and position feature information.

2、可以在获取所述采集阵列中各采集设备分别传输的视频数据流的第N帧时，匹配各视频数据流的第N帧的时间戳信息，其中，N为不小于1的整数。当各视频数据流的第N帧的时间戳信息一致时，确定各采集设备分别传输的视频数据流之间帧级同步。2. When acquiring the Nth frame of the video data stream respectively transmitted by each acquisition device in the acquisition array, match the time stamp information of the Nth frame of each video data stream, where N is an integer not less than 1. When the time stamp information of the Nth frame of each video data stream is consistent, it is determined that the frame-level synchronization between the video data streams respectively transmitted by each acquisition device is determined.

在所述采集阵列中各采集设备分别传输的视频数据流之间未帧级同步时，重新向所述采集阵列中各采集设备分别发送拉流指令，可以采用以上至少一种方式确定是否帧级同步，直至所述采集阵列中各采集设备分别传输的视频数据流之间帧级同步。When the video data streams transmitted by each acquisition device in the acquisition array are not synchronized at the frame level, re-send streaming instructions to each acquisition device in the acquisition array, and at least one of the above methods can be used to determine whether frame-level synchronization until the frame-level synchronization between the video data streams respectively transmitted by the acquisition devices in the acquisition array.
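The timestamp check and the re-pull loop described above can be sketched as follows. The callbacks `send_pull` and `read_nth_timestamps`, the `tolerance` parameter, and the `max_attempts` bound are hypothetical and introduced only for illustration; the description itself simply repeats the instruction until synchronization is reached.

```python
from typing import Callable, Sequence

def timestamps_match(stamps: Sequence[int], tolerance: int = 0) -> bool:
    """Return True if all Nth-frame timestamps agree within the tolerance."""
    if not stamps:
        raise ValueError("at least one stream is required")
    return max(stamps) - min(stamps) <= tolerance

def pull_until_synchronized(
    send_pull: Callable[[], None],
    read_nth_timestamps: Callable[[int], Sequence[int]],
    n: int = 1,
    max_attempts: int = 5,
) -> int:
    """Resend the stream-pulling instruction until the Nth frames line up.

    Returns the number of attempts used. max_attempts is an assumed safety bound.
    """
    for attempt in range(1, max_attempts + 1):
        send_pull()  # send the instruction to every acquisition device
        if timestamps_match(read_nth_timestamps(n)):
            return attempt
    raise RuntimeError("streams did not reach frame-level synchronization")

# Simulated array of three devices: the first pull is out of sync, the second is not.
simulated = iter([[100, 140, 100], [200, 200, 200]])
current = {}
attempts = pull_until_synchronized(
    lambda: current.update(stamps=next(simulated)),
    lambda n: current["stamps"],
)
print(attempts)  # 2
```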

In a specific implementation, video frames in the video data streams of the acquisition devices may also be captured and transmitted to a designated target. To ensure frame-level synchronization of the captured video frames, the following steps may be included, as shown in FIG. 22:

S221. Determine one of the video data streams received in real time from the acquisition devices in the acquisition array as the reference data stream.

S222. Based on a received video frame capture instruction, determine the video frame to be captured in the reference data stream, and select, from each of the remaining video data streams, the video frame synchronized with the video frame to be captured in the reference data stream as the video frame to be captured in that stream.

S223. Capture the video frames to be captured in each video data stream.

S224. Synchronously upload the captured video frames to the designated target.

With the above solution, frame capture can be synchronized and made more efficient, which further improves the display quality of the generated multi-angle free-view video and enhances the user experience. In addition, the process of selecting and capturing video frames becomes less coupled to the process of generating the multi-angle free-view video, which makes the processes more independent and eases later maintenance. Synchronously uploading the captured video frames to the designated target saves network transmission resources, reduces the data processing load, and speeds up generation of the multi-angle free-view video.

To help those skilled in the art better understand and implement the embodiments of the present invention, the following application examples describe in detail how to determine the video frames to be captured in each video data stream.

One approach is to select, according to the feature information of the object in the video frame to be captured in the reference data stream, the video frames in the remaining video data streams whose object feature information is consistent with it, as the video frames to be captured in the remaining video data streams.

For example, suppose the acquisition array contains 40 acquisition devices, so that 40 video data streams can be received in real time. Suppose that, among the video data streams received in real time from the acquisition devices in the acquisition array, the video data stream A1 corresponding to acquisition device A1' is determined as the reference data stream. Then, based on the feature information X of the object in the video frame that the received video frame capture instruction indicates should be captured, the video frame a1 in the reference data stream whose object feature information is consistent with X is determined as the video frame to be captured. Next, according to the feature information x1 of the object in video frame a1, the video frames a2-a40 in the remaining video data streams A2-A40 whose object feature information is consistent with x1 are selected as the video frames to be captured in the remaining video data streams.

The feature information of the object may include shape feature information, color feature information, position feature information, and the like. The feature information X indicated in the video frame capture instruction and the feature information x1 of the object in video frame a1 of the reference data stream may be the same representation of the feature information of the same object; for example, X and x1 may both be two-dimensional feature information. They may also be different representations of the feature information of the same object; for example, X may be two-dimensional feature information while x1 is three-dimensional feature information. In addition, a similarity threshold can be preset. When the similarity threshold is satisfied, the feature information X can be regarded as consistent with x1, or the feature information x1 can be regarded as consistent with the feature information x2-x40 of the object in the remaining video data streams A2-A40.

The specific representation of the feature information of the object and the similarity threshold may be determined according to the preset multi-angle free-view range and the on-site scene, and are not limited in this embodiment.
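As an illustration of threshold-based feature matching, the sketch below assumes that the feature information is represented as a numeric vector and that consistency is measured by cosine similarity. Both choices, along with the 0.9 threshold and the frame identifiers, are assumptions made for this example, since the embodiment leaves the representation and threshold open.

```python
import math
from typing import Optional, Sequence, Tuple

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_matching_frame(
    reference_feature: Sequence[float],
    candidates: Sequence[Tuple[str, Sequence[float]]],
    threshold: float = 0.9,
) -> Optional[str]:
    """Return the id of the most similar candidate frame that meets the threshold."""
    best_id, best_score = None, threshold
    for frame_id, feature in candidates:
        score = cosine_similarity(reference_feature, feature)
        if score >= best_score:
            best_id, best_score = frame_id, score
    return best_id

# x1 is the feature of the object in reference frame a1; candidates come from stream A2.
x1 = [1.0, 0.0, 0.5]
stream_a2 = [("a2_f10", [0.0, 1.0, 0.0]), ("a2_f11", [1.0, 0.05, 0.5])]
print(select_matching_frame(x1, stream_a2))  # a2_f11
```

Returning `None` when no candidate meets the threshold lets the caller fall back to another approach, such as timestamp matching.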

Another approach is to select, according to the timestamp information of the video frame in the reference data stream, the video frames in the remaining video data streams whose timestamp information is consistent with it, as the video frames to be captured in the remaining video data streams.

For example, suppose the acquisition array contains 40 acquisition devices, so that 40 video data streams can be received in real time. Suppose that, among the video data streams received in real time from the acquisition devices in the acquisition array, the video data stream B1 corresponding to acquisition device B1 is determined as the reference data stream. Then, based on the timestamp information Y of the video frame that the received video frame capture instruction indicates should be captured, the video frame b1 in the reference data stream corresponding to Y is determined as the video frame to be captured. Next, according to the timestamp information y1 of video frame b1, the video frames b2-b40 in the remaining video data streams B2-B40 whose timestamp information is consistent with y1 are selected as the video frames to be captured in the remaining video data streams.
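The timestamp-based selection in this example can be sketched as follows. The sketch assumes each stream exposes a sorted list of frame timestamps and treats "consistent" as "equal within a tolerance"; the function names, the dictionary layout, and the tolerance parameter are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List, Optional

def find_frame_by_timestamp(timestamps: List[int], target: int, tolerance: int = 0) -> Optional[int]:
    """Return the index of the frame closest to target within tolerance, else None.

    Assumption: timestamps are sorted in ascending order.
    """
    i = bisect_left(timestamps, target)
    best = None
    for j in (i - 1, i):  # only the two neighbors of the insertion point can be closest
        if 0 <= j < len(timestamps) and abs(timestamps[j] - target) <= tolerance:
            if best is None or abs(timestamps[j] - target) < abs(timestamps[best] - target):
                best = j
    return best

def select_frames(streams: Dict[str, List[int]], reference: str, requested_ts: int,
                  tolerance: int = 0) -> Dict[str, Optional[int]]:
    """Pick frame b1 in the reference stream, then the matching frame in every stream."""
    ref_index = find_frame_by_timestamp(streams[reference], requested_ts, tolerance)
    if ref_index is None:
        raise LookupError("no frame in the reference stream matches the requested timestamp")
    y1 = streams[reference][ref_index]
    return {name: find_frame_by_timestamp(ts, y1, tolerance) for name, ts in streams.items()}

# Three streams with timestamps in milliseconds; B3 started one frame later.
streams = {"B1": [0, 40, 80, 120], "B2": [0, 40, 80, 120], "B3": [40, 80, 120, 160]}
print(select_frames(streams, "B1", 80))  # {'B1': 2, 'B2': 2, 'B3': 1}
```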

It can be understood that the methods described above for determining the video frames to be captured in each video data stream may be used alone or together, which is not limited in the embodiments of the present invention.

With the above solution, video frames can be selected and captured synchronously with greater efficiency and accuracy, which improves the completeness and synchronization of the transmitted data.

An embodiment of the present invention also provides a data processing device corresponding to the above data processing method. To help those skilled in the art better understand and implement the embodiments of the present invention, it is described in detail below through specific embodiments with reference to the accompanying drawings.

Referring to the schematic structural diagram of the data processing device shown in FIG. 23, in an embodiment of the present invention the data processing device 230 may include:

an instruction sending unit 231, adapted to send stream-pulling instructions to each acquisition device in an acquisition array, wherein the acquisition devices in the acquisition array are placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range, and each acquisition device in the acquisition array synchronously acquires a video data stream in real time from its respective angle;

a data stream receiving unit 232, adapted to receive in real time the video data streams respectively transmitted by the acquisition devices in the acquisition array based on the stream-pulling instructions;

a first synchronization judging unit 233, adapted to determine whether the video data streams respectively transmitted by the acquisition devices in the acquisition array are frame-level synchronized, and, when they are not, to re-trigger the instruction sending unit 231 until the video data streams respectively transmitted by the acquisition devices in the acquisition array are frame-level synchronized.

The data processing device may be deployed according to the actual scenario. For example, when there is spare space on site, the data processing device can be placed in a non-acquisition area on site as an on-site server; when there is no spare space on site, it can be placed in the cloud as a cloud server.

With the above data processing device, determining whether the video data streams respectively transmitted by the acquisition devices in the acquisition array are frame-level synchronized ensures synchronous transmission of the multiple data streams, which avoids dropped frames and extra frames during transmission, increases data processing speed, and thereby meets the requirement for low-latency playback of multi-angle free-view video.

In an embodiment of the present invention, as shown in FIG. 23, the data processing device 230 may further include:

a reference video stream determination unit 234, adapted to determine one of the video data streams received in real time from the acquisition devices in the acquisition array as the reference data stream;

a video frame selection unit 235, adapted to determine, based on a received video frame capture instruction, the video frame to be captured in the reference data stream, and to select, from each of the remaining video data streams, the video frame synchronized with the video frame to be captured in the reference data stream as the video frame to be captured in that stream;

a video frame capture unit 236, adapted to capture the video frames to be captured in each video data stream;

an uploading unit 237, adapted to synchronously upload the captured video frames to the designated target.

The data processing device 230 may establish a connection with a target in advance through a port or IP address, or may synchronously upload the captured video frames to the port or IP address specified by the video frame capture instruction.

With the above solution, frame capture can be synchronized and made more efficient, which further improves the display quality of the generated multi-angle free-view video and enhances the user experience. In addition, the process of selecting and capturing video frames becomes less coupled to the process of generating the multi-angle free-view video, which makes the processes more independent and eases later maintenance. Synchronously uploading the captured video frames to the designated target saves network transmission resources, reduces the data processing load, and speeds up generation of the multi-angle free-view video.

In an embodiment of the present invention, as shown in FIG. 23, the video frame selection unit 235 includes at least one of the following:

a first video frame selection module 2351, adapted to select, according to the feature information of the object in the video frame to be captured in the reference data stream, the video frames in the remaining video data streams whose object feature information is consistent with it, as the video frames to be captured in the remaining video data streams;

a second video frame selection module 2352, adapted to select, according to the timestamp information of the video frame in the reference data stream, the video frames in the remaining video data streams whose timestamp information is consistent with it, as the video frames to be captured in the remaining video data streams.

An embodiment of the present invention also provides a data synchronization system corresponding to the above data processing method, which uses the above data processing device to receive multiple video data streams in real time. To help those skilled in the art better understand and implement the embodiments of the present invention, it is described in detail below through specific embodiments with reference to the accompanying drawings.

Referring to the schematic structural diagram of the data synchronization system shown in FIG. 24, in an embodiment of the present invention the data synchronization system 240 may include an acquisition array 241 placed in the on-site acquisition area and a data processing device 242 connected to the acquisition array by a link. The acquisition array 241 includes multiple acquisition devices, which are placed at different positions in the on-site acquisition area according to a preset multi-angle free-view range, wherein:

each acquisition device in the acquisition array 241 is adapted to synchronously acquire a video data stream in real time from its respective angle and, based on the stream-pulling instruction sent by the data processing device 242, to transmit the obtained video data stream to the data processing device 242 in real time;

the data processing device 242 is adapted to send stream-pulling instructions to each acquisition device in the acquisition array 241, to receive in real time the video data streams respectively transmitted by the acquisition devices in the acquisition array 241 based on the stream-pulling instructions, and, when those video data streams are not frame-level synchronized, to send stream-pulling instructions again to each acquisition device in the acquisition array 241 until the video data streams transmitted by the acquisition devices in the acquisition array 241 are frame-level synchronized.

With the data synchronization system of the embodiments of the present invention, determining whether the video data streams respectively transmitted by the acquisition devices in the acquisition array are frame-level synchronized ensures synchronous transmission of the multiple data streams, which avoids dropped frames and extra frames during transmission, increases data processing speed, and thereby meets the requirement for low-latency playback of multi-angle free-view video.

In a specific implementation, the data processing device 242 is further adapted to: determine one of the video data streams received in real time from the acquisition devices in the acquisition array 241 as the reference data stream; determine, based on a received video frame capture instruction, the video frame to be captured in the reference data stream, and select, from each of the remaining video data streams, the video frame synchronized with the video frame to be captured in the reference data stream as the video frame to be captured in that stream; and capture the video frames to be captured in each video data stream and synchronously upload the captured video frames to the designated target.

The data processing device 242 may establish a connection with a target in advance through a port or IP address, or may synchronously upload the captured video frames to the port or IP address specified by the video frame capture instruction.

In an embodiment of the present invention, the data synchronization system 240 may further include a cloud server, adapted to serve as the designated target.

In another embodiment of the present invention, as shown in FIG. 34, the data synchronization system 240 may further include a playback control device 341, adapted to serve as the designated target.

In yet another embodiment of the present invention, as shown in FIG. 35, the data synchronization system 240 may further include an interactive terminal 351, adapted to serve as the designated target.

In an embodiment of the present invention, at least one of the following approaches may be used to ensure that each acquisition device in the acquisition array 241 synchronously acquires a video data stream in real time from its respective angle:

1. The acquisition devices in the acquisition array are connected through a synchronization line. When at least one acquisition device obtains an acquisition start instruction, that device synchronizes the instruction to the other acquisition devices through the synchronization line, so that each acquisition device in the acquisition array starts synchronously acquiring a video data stream in real time from its respective angle based on the acquisition start instruction;

To help those skilled in the art better understand and implement the embodiments of the present invention, the data synchronization system is described in detail below through a specific application scenario. FIG. 25 is a schematic structural diagram of the data synchronization system in this application scenario, in which the data synchronization system includes an acquisition array 251 composed of acquisition devices, a data processing device 252, and a cloud server cluster 253.

At least one of the acquisition devices in the acquisition array 251 obtains an acquisition start instruction and synchronizes it to the other acquisition devices through the synchronization line 254, so that each acquisition device in the acquisition array starts synchronously acquiring a video data stream in real time from its respective angle based on the acquisition start instruction.

The data processing device 252 may send stream-pulling instructions to each acquisition device in the acquisition array 251 over a wireless local area network. Based on the stream-pulling instruction sent by the data processing device 252, each acquisition device in the acquisition array 251 transmits the obtained video data stream to the data processing device 252 in real time through the switch 255.

The data processing device 252 determines whether the video data streams respectively transmitted by the acquisition devices in the acquisition array 251 are frame-level synchronized and, when they are not, sends stream-pulling instructions again to each acquisition device in the acquisition array 251 until the video data streams transmitted by the acquisition devices in the acquisition array 251 are frame-level synchronized.

After determining that the video data streams transmitted by the acquisition devices in the acquisition array 251 are frame-level synchronized, the data processing device 252 determines one of the video data streams received in real time from the acquisition devices in the acquisition array 251 as the reference data stream. After receiving a video frame capture instruction, it determines the video frame to be captured in the reference data stream according to that instruction. The data processing device 252 then selects, from each of the remaining video data streams, the video frame synchronized with the video frame to be captured in the reference data stream as the video frame to be captured in that stream, captures the video frames to be captured in each video data stream, and synchronously uploads the captured video frames to the cloud.

The cloud server cluster 253 performs subsequent processing on the captured video frames to obtain a multi-angle free-view video for playback.

In a specific implementation, the cloud server cluster 253 may include a first cloud server 2531, a second cloud server 2532, a third cloud server 2533, and a fourth cloud server 2534. The first cloud server 2531 may be used for parameter calculation; the second cloud server 2532 may be used for depth calculation to generate depth maps; the third cloud server 2533 may be used to reconstruct frame images along a preset virtual viewpoint path using DIBR; and the fourth cloud server 2534 may be used to generate the multi-angle free-view video.

It can be understood that the data processing device may be placed in a non-acquisition area on site or in the cloud according to the actual scenario. In practical applications, the data synchronization system may use at least one of a cloud server, a playback control device, or an interactive terminal as the sender of the video frame capture instruction, or may use another device capable of sending video frame capture instructions, which is not limited in the embodiments of the present invention.

It should be noted that the data synchronization system of the embodiments of the present invention can be applied to the data processing systems and the like in the foregoing embodiments.

An embodiment of the present invention also provides an acquisition device corresponding to the above data processing method. The acquisition device is adapted to, upon obtaining an acquisition start instruction, synchronize the instruction to other acquisition devices and start synchronously acquiring a video data stream in real time from its respective angle, and, upon receiving a stream-pulling instruction sent by the data processing device, transmit the obtained video data stream to the data processing device in real time. To help those skilled in the art better understand and implement the embodiments of the present invention, it is described in detail below through specific embodiments with reference to the accompanying drawings.

Referring to the schematic structural diagram of the acquisition device shown in FIG. 36, in an embodiment of the present invention the acquisition device 360 includes a photoelectric conversion camera assembly 361, a processor 362, an encoder 363, and a transmission component 365, wherein:

the photoelectric conversion camera assembly 361 is adapted to acquire images;

the processor 362 is adapted to, upon obtaining an acquisition start instruction, synchronize the instruction to other acquisition devices through the transmission component 365 and start processing in real time the images acquired by the photoelectric conversion camera assembly 361 to obtain an image data sequence, and, upon obtaining a stream-pulling instruction, transmit the obtained video data stream to the data processing device in real time through the transmission component 365;

the encoder 363 is adapted to encode the image data sequence to obtain the corresponding video data stream.

As an optional solution, as shown in FIG. 36, the acquisition device 360 may further include a recording component 364, adapted to acquire sound signals and obtain audio data.

The processor 362 can process the acquired image data sequence and audio data, and the encoder 363 can then encode them to obtain the corresponding video data stream. Upon obtaining an acquisition start instruction, the processor 362 can synchronize the instruction to other acquisition devices through the transmission component 365; upon receiving a stream-pulling instruction, it transmits the obtained video data stream to the data processing device in real time through the transmission component 365.

In a specific implementation, the acquisition devices may be placed at different positions in the on-site acquisition area according to the preset multi-angle free-view range. An acquisition device may be fixed at a certain point in the on-site acquisition area, or may move within the on-site acquisition area, so as to form the acquisition array. The acquisition device may therefore be a fixed device or a mobile device, which allows video data streams to be acquired flexibly from multiple angles.

FIG. 37 is a schematic diagram of an acquisition array in an application scenario of an embodiment of the present invention. The center of the stage is taken as the core viewing point, and a sector-shaped area centered on the core viewing point and lying in the same plane as it is taken as the preset multi-angle free-view range. Acquisition devices 371-375 in the acquisition array are placed in a fan shape at different positions in the on-site acquisition area according to the preset multi-angle free-view range. Acquisition device 376 is a movable device that can move to a designated position on command for flexible acquisition. An acquisition device may also be a handheld device, used to supplement the acquired data when an acquisition device fails or in confined spaces. For example, the handheld device 377 located in the stage audience area in FIG. 37 can join the acquisition array to provide a video data stream of the stage audience area.

As mentioned above, generating multi-angle free-view data requires depth map calculation, which currently takes a long time. How to reduce the time needed to generate depth maps and raise the depth map generation rate is therefore a problem that urgently needs solving.

To address this problem, embodiments of the present invention provide a computing node cluster in which multiple computing nodes can generate depth maps in parallel, in batch fashion, from texture data captured synchronously by the same acquisition array. Specifically, the depth map calculation can be divided into several steps: a first depth calculation that yields rough depth maps, determination of the unstable regions in the rough depth maps, and a subsequent second depth calculation. In each step, multiple computing nodes in the cluster can work in parallel: they perform the first depth calculation on the texture data captured by multiple acquisition devices to obtain rough depth maps, and they verify the resulting rough depth maps and perform the second depth calculation in parallel. This saves depth map calculation time and raises the depth map generation rate. Specific embodiments are described in further detail below with reference to the accompanying drawings.

Referring to the flowchart of a depth map generation method shown in FIG. 26, in this embodiment of the present invention multiple computing nodes in a computing node cluster each generate depth maps. For convenience of description, any computing node in the cluster is referred to as the first computing node. The depth map generation method of the computing node cluster is described in detail below through specific steps:

S261: Receive texture data, the texture data being captured synchronously by multiple acquisition devices in the same acquisition array.

In a specific implementation, the multiple acquisition devices can be placed at different positions in the on-site acquisition area according to the preset multi-angle free-view range. An acquisition device can be fixed at a point in the on-site acquisition area, or can move within that area, so as to form the acquisition array. Here, the multi-angle free view may refer to the spatial position and viewing angle of a virtual viewpoint that allows the scene to be switched freely. For example, the multi-angle free view can be a view with 6 degrees of freedom (6DoF). The acquisition devices used in the acquisition array can be general-purpose cameras, webcams, video recorders, or handheld devices such as mobile phones. For the specific implementation, refer to the other embodiments of the present invention; details are not repeated here.

The texture data is the pixel data of the two-dimensional image frames captured by the aforementioned acquisition devices. It may be the image at a single frame time, or the pixel data of the frame images of a video stream formed from consecutive or non-consecutive frame images.

S262: The first computing node performs a first depth calculation according to first texture data and second texture data to obtain a first rough depth map.

Here, for clarity and brevity, the texture data that satisfies a preset first mapping relationship with the first computing node is referred to as the first texture data, and the texture data captured by an acquisition device that satisfies a preset first spatial position relationship with the acquisition device of the first texture data is referred to as the second texture data.

In a specific implementation, the first mapping relationship can be obtained from a preset first mapping relationship table or through random mapping. For example, the texture data to be processed by each computing node can be allocated in advance according to the number of computing nodes in the cluster and the number of acquisition devices in the acquisition array that produced the texture data. A dedicated allocation node can be set up to assign the computing tasks of the computing nodes in the cluster, and this allocation node can obtain the first mapping relationship from the preset first mapping relationship table or through random mapping. For example, if the acquisition array contains 40 acquisition devices, 40 computing nodes can be configured to achieve the highest concurrent processing efficiency, with one computing node per acquisition device. If only 20 computing nodes are available and their processing capabilities are the same or roughly equal, each computing node can be set to handle the texture data captured by two acquisition devices, so as to achieve the highest concurrent processing efficiency while balancing the load. Specifically, a mapping between the identifier of the acquisition device that produced the texture data and the identifier of each computing node can be set as the first mapping relationship, and the texture data captured by each acquisition device in the array can then be distributed directly to the corresponding computing node based on that mapping. In a specific implementation, computing tasks can also be assigned randomly, with the texture data captured by each acquisition device distributed at random among the computing nodes in the cluster. In that case, to improve processing efficiency, all the texture data captured by the acquisition array can be copied to every computing node in the cluster in advance.
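The table-based first mapping relationship described above can be illustrated with a minimal sketch. It assumes a simple round-robin assignment of acquisition device identifiers to computing node identifiers; the function name `build_first_mapping` and the identifier formats are hypothetical and are not taken from the embodiment, which only requires that some preset table or random mapping exist.

```python
from collections import defaultdict


def build_first_mapping(camera_ids, node_ids):
    """Assign acquisition devices to computing nodes round-robin.

    Hypothetical helper: the embodiment only requires a preset mapping table;
    round-robin is one simple way to balance load across equal nodes.
    """
    if not node_ids:
        raise ValueError("at least one computing node is required")
    mapping = defaultdict(list)
    for index, camera_id in enumerate(camera_ids):
        node_id = node_ids[index % len(node_ids)]
        mapping[node_id].append(camera_id)
    return dict(mapping)


# 40 acquisition devices shared among 20 computing nodes of equal capacity.
cameras = [f"cam{i}" for i in range(40)]
nodes = [f"node{i}" for i in range(20)]
first_mapping = build_first_mapping(cameras, nodes)
```

With 40 devices and 20 nodes, every node receives exactly two devices, which matches the balanced-load case described in the text; with 40 nodes the same function yields one device per node.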

As an example, any server in a server cluster can perform the first depth calculation according to the first texture data and the second texture data.

As for the preset first spatial position relationship between the first texture data and the second texture data, the second texture data can be, for example, texture data captured by acquisition devices that satisfy a preset first distance relationship with the acquisition device of the first texture data, or texture data captured by acquisition devices that satisfy a preset first quantity relationship with that device, or texture data captured by acquisition devices that satisfy both the preset first distance relationship and the preset first quantity relationship with that device.

Here, the first preset number can be any integer from 1 to N-1, where N is the total number of acquisition devices in the acquisition array. In one embodiment of the present invention, the first preset number is 2, so that the highest possible image quality is obtained with as little computation as possible. For example, assuming that computing node 9 corresponds to camera 9 in the preset first mapping relationship, the rough depth map of camera 9 can be calculated from the texture data of camera 9 together with the texture data of cameras 5, 6, 7, 10, 11 and 12, which are adjacent to camera 9.
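As a rough illustration of the first spatial position relationship, the sketch below selects the acquisition devices whose texture data would serve as the second texture data, using a distance limit, a count limit, or both. The function name `select_neighbors`, the coordinate representation, and the tie-breaking rule are assumptions made for illustration only.

```python
import math


def select_neighbors(positions, target_id, max_count=None, max_distance=None):
    """Pick devices near `target_id` by distance and/or count.

    `positions` maps a device ID to its coordinates (hypothetical layout data).
    Ties in distance are broken by device ID so the result is deterministic.
    """
    target = positions[target_id]
    ranked = sorted(
        (math.dist(target, position), device_id)
        for device_id, position in positions.items()
        if device_id != target_id
    )
    if max_distance is not None:
        ranked = [(d, device_id) for d, device_id in ranked if d <= max_distance]
    if max_count is not None:
        ranked = ranked[:max_count]
    return [device_id for _, device_id in ranked]


# Five cameras on a line, one unit apart; camera 2 is the first-texture device.
layout = {camera: (float(camera), 0.0) for camera in range(5)}
nearest_two = select_neighbors(layout, 2, max_count=2)
within_two_units = select_neighbors(layout, 2, max_distance=2.0)
```

Passing only `max_count` corresponds to the quantity relationship, only `max_distance` to the distance relationship, and both together to the combined case.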

It can be understood that, in a specific implementation, the second texture data can also be data captured by acquisition devices that satisfy another type of first spatial position relationship with the acquisition device of the first texture data; for example, the first spatial position relationship can also be a preset angle or a preset relative position.

S263: The first computing node synchronizes the first rough depth map to the other computing nodes in the computing node cluster to obtain a rough depth map set.

The rough depth maps obtained from the rough depth calculation need to be cross-validated to determine the unstable regions in each rough depth map, so that a refined solution can be computed in the next step. Any rough depth map in the rough depth map set needs to be cross-validated against the rough depth maps of multiple acquisition devices around the acquisition device to which it corresponds (typically, the rough depth map to be verified is cross-validated together with the rough depth maps of all the other acquisition devices). The rough depth map calculated by each computing node therefore needs to be synchronized to the other computing nodes in the cluster. After the synchronization in step S263, every computing node in the cluster has the rough depth maps calculated by the other computing nodes, and each server holds exactly the same rough depth map set.

S264: For a second rough depth map in the rough depth map set, the first computing node uses a third rough depth map for verification to obtain the unstable region in the second rough depth map.

Here, the second rough depth map and the first computing node can satisfy a preset second mapping relationship, and the third rough depth map can be the rough depth map of an acquisition device that satisfies a preset second spatial position relationship with the acquisition device corresponding to the second rough depth map.

The second mapping relationship can be obtained from a preset second mapping relationship table or through random mapping. For example, the texture data to be processed by each computing node can be allocated in advance according to the number of computing nodes in the cluster and the number of acquisition devices in the acquisition array that produced the texture data. In a specific implementation, a dedicated allocation node can be set up to assign the computing tasks of the computing nodes in the cluster, and this allocation node can obtain the second mapping relationship from the preset second mapping relationship table or through random mapping. For a specific example of setting the second mapping relationship, refer to the implementation example of the first mapping relationship above.

It can be understood that, in a specific implementation, the second mapping relationship may correspond exactly to the first mapping relationship or may differ from it. For example, when the number of cameras equals the number of computing nodes, a one-to-one second mapping relationship can be established, according to hardware identifiers, between the identifier of the acquisition device corresponding to the data (including texture data and rough depth maps) and the identifier of the computing node that processes the data.

It can be understood that the terms first rough depth map, second rough depth map and third rough depth map are used here only for clarity and brevity. In a specific implementation, the first rough depth map may be the same as the second rough depth map or different from it; it is sufficient that the acquisition device corresponding to the third rough depth map satisfies the preset second spatial position relationship with the acquisition device corresponding to the second rough depth map.

As specific examples of the second spatial position relationship, the texture data corresponding to the third rough depth map can be texture data captured by acquisition devices that satisfy a preset second distance relationship with the acquisition device corresponding to the second rough depth map, or texture data captured by acquisition devices that satisfy a preset second quantity relationship with that device, or texture data captured by acquisition devices that satisfy both the preset second distance relationship and the preset second quantity relationship with that device.

Here, the second preset number can be any integer from 1 to N-1, where N is the total number of acquisition devices in the acquisition array. In a specific implementation, the second preset number may or may not equal the first preset number. In one embodiment of the present invention, the second preset number is 2, so that the highest possible image quality is obtained with as little computation as possible.

In a specific implementation, the second spatial position relationship can also be another type of spatial position relationship, such as a preset angle or a preset relative position.

S265: The first computing node performs a second depth calculation according to the unstable region in the second rough depth map, the texture data corresponding to the second rough depth map, and the texture data corresponding to the third rough depth map, to obtain the corresponding fine depth map.

It should be noted that the second depth calculation differs from the first depth calculation in that the depth candidate values it selects from the second rough depth map do not include the depth values of the unstable region. The unstable regions can thus be excluded from the generated depth map, which makes the depth map more accurate and in turn improves the quality of the generated multi-angle free-view images.
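Steps S264 and S265 can be illustrated with a deliberately simplified sketch. The text does not specify the cross-validation criterion, so the version below rests on two loudly labeled assumptions: the neighboring rough depth maps have already been reprojected into the view of the map under verification, and a pixel counts as unstable when more than half of the neighbors disagree with it beyond a tolerance. The function names and the tolerance value are hypothetical.

```python
def find_unstable_region(rough, aligned_neighbors, tolerance=0.05):
    """Return a boolean mask of pixels whose depth the neighbors contradict.

    Assumption: every map in `aligned_neighbors` is already reprojected into
    the same view and has the same size as `rough` (lists of rows of floats).
    """
    height, width = len(rough), len(rough[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            disagreements = sum(
                1
                for other in aligned_neighbors
                if abs(other[y][x] - rough[y][x]) > tolerance
            )
            # Majority vote (an assumed criterion, not the patent's).
            mask[y][x] = disagreements * 2 > len(aligned_neighbors)
    return mask


def stable_depth_candidates(rough, mask):
    """Collect candidate depths, excluding values from the unstable region."""
    return sorted(
        {
            rough[y][x]
            for y in range(len(rough))
            for x in range(len(rough[0]))
            if not mask[y][x]
        }
    )


rough_map = [[1.0, 2.0], [3.0, 9.0]]
neighbor_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 2.1], [3.0, 4.0]],
]
unstable = find_unstable_region(rough_map, neighbor_maps)
candidates = stable_depth_candidates(rough_map, unstable)
```

In this toy input only the bottom-right pixel is contradicted by both neighbors, so its depth value is left out of the candidate set that a second depth calculation would draw from.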

An application scenario serves as an example:

A server S can perform the first round of depth calculation (the first depth calculation) based on the texture data of its assigned camera M and the texture data of the cameras that satisfy the preset first spatial position relationship with camera M, obtaining a rough depth map.

After the cross-validation in step S264, the refined solution of the depth map can continue on the same server without interruption. Specifically, the server S can cross-validate the rough depth map of its assigned camera M against all the other rough depth maps to obtain the unstable region in the rough depth map of camera M. The server S can then perform another round of depth calculation (the second depth calculation) using the unstable region in the rough depth map of camera M, the texture data captured by camera M, and the texture information of the N cameras around camera M, which yields the refined depth map corresponding to the first texture data (the texture data captured by camera M).

Here, the rough depth map of camera M is the rough depth map calculated from the texture data captured by camera M and the texture data captured by the acquisition devices that satisfy the preset first spatial position relationship with camera M.

S266: Take the fine depth map set formed by the fine depth maps obtained by the computing nodes as the finally generated depth maps.

With the above embodiment, multiple computing nodes can generate depth maps in parallel, in batch fashion, from texture data captured synchronously by the same acquisition array, which greatly improves the efficiency of depth map generation.

In addition, because the above scheme uses a second depth calculation to exclude the unstable regions from the generated depth maps, the resulting fine depth maps are more accurate, which in turn improves the quality of the generated multi-angle free-view images.

In a specific implementation, the configuration and number of computing nodes in the cluster can be chosen according to the volume of texture data to be processed and the required depth map generation speed. For example, the computing node cluster can be a server cluster composed of multiple servers, which can be deployed centrally or in a distributed manner. In some embodiments of the present invention, some or all of the computing node devices in the cluster can serve as local servers, as edge node devices, or as cloud computing devices.

As another example, the computing node cluster can also be a computing device formed by multiple CPUs or GPUs.

An embodiment of the present invention further provides a computing node suitable for forming a computing node cluster with at least one other computing node in order to generate depth maps. Referring to the schematic structural diagram of the computing node shown in FIG. 27, the computing node 270 can include:

an input unit 271, adapted to receive texture data, the texture data being captured synchronously by multiple acquisition devices in the same acquisition array;

a first depth calculation unit 272, adapted to perform a first depth calculation according to first texture data and second texture data to obtain a first rough depth map, wherein the first texture data and the computing node satisfy a preset first mapping relationship, and the second texture data is texture data captured by an acquisition device that satisfies a preset first spatial position relationship with the acquisition device of the first texture data;

a synchronization unit 273, adapted to synchronize the first rough depth map to the other computing nodes in the computing node cluster to obtain a rough depth map set;

a verification unit 274, adapted to verify a second rough depth map in the rough depth map set using a third rough depth map to obtain the unstable region in the second rough depth map, wherein the second rough depth map and the computing node satisfy a preset second mapping relationship, and the third rough depth map is the rough depth map of an acquisition device that satisfies a preset second spatial position relationship with the acquisition device corresponding to the second rough depth map;

a second depth calculation unit 275, adapted to perform a second depth calculation according to the unstable region in the second rough depth map, the texture data corresponding to the second rough depth map, and the texture data corresponding to the third rough depth map, to obtain the corresponding fine depth map, wherein the depth candidate values that the second depth calculation selects from the second rough depth map do not include the depth values of the unstable region;

an output unit 276, adapted to output the fine depth map, so that the computing node cluster obtains a fine depth map set as the finally generated depth maps.

With the above computing node, the depth map calculation can comprise several steps: a first depth calculation that yields a rough depth map, determination of the unstable region in the rough depth map, and a subsequent second depth calculation. Structuring the calculation in these steps makes it easy for multiple computing nodes to work separately, which improves the efficiency of depth map generation.

An embodiment of the present invention further provides a computing node cluster. The cluster can include multiple computing nodes, which can generate depth maps in parallel, in batch fashion, from texture data captured synchronously by the same acquisition array. For convenience of description, any computing node in the cluster is referred to as the first computing node.

In some embodiments of the present invention, the first computing node is adapted to: perform a first depth calculation according to first texture data and second texture data in the received texture data to obtain a first rough depth map; synchronize the first rough depth map to the other computing nodes in the computing node cluster to obtain a rough depth map set; verify a second rough depth map in the rough depth map set using a third rough depth map to obtain the unstable region in the second rough depth map; perform a second depth calculation according to the unstable region in the second rough depth map, the texture data corresponding to the second rough depth map, and the texture data corresponding to the third rough depth map, to obtain the corresponding fine depth map; and output the obtained fine depth map so that the computing node cluster takes the resulting fine depth map set as the finally generated depth maps;

wherein the first texture data and the first computing node satisfy a preset first mapping relationship; the second texture data is texture data captured by an acquisition device that satisfies a preset first spatial position relationship with the acquisition device of the first texture data; the second rough depth map and the first computing node satisfy a preset second mapping relationship; the third rough depth map is the rough depth map of an acquisition device that satisfies a preset second spatial position relationship with the acquisition device corresponding to the second rough depth map; and the depth candidate values that the second depth calculation selects from the second rough depth map do not include the depth values of the unstable region.

Referring to the schematic diagram of depth map processing by a server cluster shown in FIG. 28, the texture data captured by the N cameras in the camera array is input to the N servers in the server cluster respectively. Each server first performs the first depth calculation, yielding rough depth maps 1 to N. Each server then copies the rough depth map it calculated to the other servers in the cluster, and the servers synchronize in time. Next, each server verifies the rough depth map assigned to it and performs the second depth calculation, and the resulting refined depth maps are taken as the depth maps generated by the server cluster. As this process shows, the servers in the cluster can, in parallel, perform the first depth calculation on the texture data captured by multiple cameras and verify and perform the second depth calculation on the rough depth maps in the rough depth map set. Because the whole depth map generation process is carried out by multiple servers in parallel, it greatly reduces depth map calculation time and improves the efficiency of depth map generation.
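The staged flow of FIG. 28 can be sketched in a few lines. This is only a single-process illustration under stated assumptions: worker threads stand in for the servers, a shared dictionary stands in for copying rough depth maps between servers, and `first_depth` and `refine` are hypothetical placeholders that return labels instead of computing real depth maps.

```python
from concurrent.futures import ThreadPoolExecutor


def first_depth(camera_id, textures):
    """Placeholder for the first depth calculation of one camera."""
    return f"rough({camera_id})"


def refine(camera_id, rough_set, textures):
    """Placeholder for verification plus the second depth calculation."""
    others = len(rough_set) - 1  # rough maps available for cross-validation
    return f"fine({camera_id}, checked against {others})"


def run_cluster(textures, workers=4):
    camera_ids = sorted(textures)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Stage 1: rough depth maps are computed in parallel, one per camera.
        rough_maps = list(
            pool.map(lambda cam: first_depth(cam, textures), camera_ids)
        )
        # Synchronization point: every worker now sees the same rough set.
        rough_set = dict(zip(camera_ids, rough_maps))
        # Stage 2: verification and second depth calculation, in parallel.
        fine_maps = list(
            pool.map(lambda cam: refine(cam, rough_set, textures), camera_ids)
        )
    return dict(zip(camera_ids, fine_maps))


fine_set = run_cluster({"cam1": b"", "cam2": b"", "cam3": b""})
```

Collecting all stage 1 results before stage 2 starts mirrors the requirement that each server hold the complete rough depth map set before verifying its own map.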

For the specific implementations and beneficial effects of the computing node and the computing node cluster in the embodiments of the present invention, refer to the depth map generation method in the foregoing embodiments; details are not repeated here.

The server cluster can then store the generated depth maps, or output them to a terminal device on request for the subsequent generation and display of virtual viewpoint images; details are not repeated here.

An embodiment of the present invention further provides a computer-readable storage medium storing computer instructions which, when run, can execute the steps of the depth map generation method of any of the foregoing embodiments. For details, refer to the steps of the depth map generation method above; they are not repeated here.

In addition, currently known virtual viewpoint image generation methods based on Depth-Image-Based Rendering (DIBR) struggle to meet the needs of multi-angle free-view applications during playback.

The inventor has found through research that current DIBR virtual viewpoint image generation methods offer little concurrency and are usually processed by the CPU. Moreover, because generating each virtual viewpoint image involves many steps, each of which is fairly complex, the generation is also difficult to implement with parallel processing.

To solve the above problem, embodiments of the present invention provide a method that generates virtual viewpoint images through parallel processing. It can greatly speed up the generation of multi-angle free-view virtual viewpoint images, thereby meeting the requirements of low-latency playback and real-time interaction for multi-angle free-view video and improving the user experience.

To make the objectives, features and advantages of the embodiments of the present invention clearer and easier for those skilled in the art to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Referring to the flowchart of the virtual viewpoint image generation method shown in FIG. 29, in a specific implementation a virtual viewpoint image can be generated through the following steps:

S291: Acquire a multi-angle free-view image combination, the parameter data of the image combination, and preset virtual viewpoint path data, wherein the image combination includes multiple groups of texture maps and depth maps that are synchronized across multiple angles and correspond to each other.

The multi-angle free view may refer to the spatial positions and viewing angles of virtual viewpoints that allow the scene to be switched freely. The range of the multi-angle free view can be determined according to the needs of the application scenario.

In a specific implementation, a capture array composed of multiple capture devices can be arranged on site. Each capture device in the array can be placed at a different position in the on-site capture area according to the preset multi-angle free-view range, and the capture devices can capture on-site images synchronously to obtain texture maps synchronized across multiple angles. For example, multiple cameras, video cameras, or the like can capture images of a scene synchronously from multiple angles.

The images in the multi-angle free-view image combination may be images of a completely free view. In a specific implementation, the view may have 6 degrees of freedom (Degree of Freedom, DoF), that is, both the spatial position and the viewing angle of the viewpoint can be switched freely. As described above, the spatial position of the viewpoint can be expressed as coordinates (x, y, z) and the viewing angle can be expressed as three rotation directions, hence the name 6DoF.

When generating a virtual viewpoint image, the image combination of multi-angle free views and the parameter data of the image combination can be acquired first.

In a specific implementation, the texture maps and depth maps in the image combination correspond one to one. A texture map may use any two-dimensional image format, for example BMP, PNG, JPEG, or WebP. A depth map represents the distance of each point in the scene from the capture device; that is, each pixel value in the depth map represents the distance between a point in the scene and the capture device.

The texture maps in the image combination are the multiple synchronized two-dimensional images. The depth data of each two-dimensional image can be determined based on these two-dimensional images.

The depth data may include depth values corresponding to pixels of the two-dimensional image. The distance from the capture device to each point in the area to be viewed can serve as the depth value, and the depth values directly reflect the geometry of the visible surfaces in the area to be viewed. For example, the depth value may be the distance, measured along the camera's optical axis, from each point in the area to be viewed to the optical center, where the origin of the camera coordinate system serves as the optical center. Those skilled in the art will understand that this distance may be a relative value, provided that the same reference is used for all images.

The depth data may include depth values in one-to-one correspondence with the pixels of the two-dimensional image, or may be a subset of values selected from such a set of depth values. Those skilled in the art will understand that the set of depth values can be stored in the form of a depth map. The original depth map is the image obtained by storing the set of depth values that correspond one to one with the pixels of the two-dimensional image (texture map), arranged in the same way as those pixels. In a specific implementation, the depth data may be obtained by downsampling the original depth map.

In an embodiment of the present invention, the image combination of multi-angle free views and the parameter data of the image combination can be obtained through the following steps, which are described below using a specific application scenario.

As a specific embodiment of the present invention, the following steps may be included. The first step is capture and depth map calculation, which consists of three main steps: multi-camera video capture (Multi-camera Video Capturing), camera intrinsic and extrinsic parameter estimation (Camera Parameter Estimation), and depth map calculation (Depth Map Calculation). For multi-camera capture, the videos captured by the individual cameras must be aligned at the frame level.

Multi-camera video capture yields the texture images (Texture Image), that is, the multiple synchronized images. Camera parameter estimation yields the camera parameters (Camera Parameter), that is, the parameter data of the image combination, including intrinsic parameter data and extrinsic parameter data. Depth map calculation yields the depth maps (Depth Map).

The multiple groups of synchronized, corresponding texture maps and depth maps in the image combination can be stitched together to form one frame of stitched image. A stitched image can have various stitching layouts, and each frame of stitched image can serve as one image combination. The groups of texture maps and depth maps in the image combination can be stitched and arranged according to a preset relationship. Specifically, according to their positions, the texture maps and depth maps of the image combination can be divided into a texture map area and a depth map area. The texture map area stores the pixel values of each texture map, and the depth map area stores the depth values corresponding to each texture map according to a preset positional relationship. The texture map area and the depth map area may be contiguous or interleaved. The embodiments of the present invention place no restriction on the positional relationship between the texture maps and depth maps in the image combination.
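As an illustration of one possible stitching layout, the following Python sketch splits a stitched frame into per-view texture and depth tiles. The layout it assumes (textures side by side in the top half, their depth maps in the same order in the bottom half) and the name `split_stitched_frame` are hypothetical; the description above allows any preset layout.

```python
def split_stitched_frame(frame, views):
    """Split a stitched frame into (texture, depth) pairs, one pair per view.

    Assumed layout (hypothetical): the top half holds the texture tiles side
    by side, the bottom half holds the matching depth tiles in the same order.
    `frame` is a list of rows; each row is a list of pixel values.
    """
    height, width = len(frame), len(frame[0])
    half, tile = height // 2, width // views
    pairs = []
    for i in range(views):
        texture = [row[i * tile:(i + 1) * tile] for row in frame[:half]]
        depth = [row[i * tile:(i + 1) * tile] for row in frame[half:2 * half]]
        pairs.append((texture, depth))
    return pairs

# Tiny 2-view frame with labelled pixels so the split is easy to inspect.
frame = [["t0", "t0", "t1", "t1"],
         ["d0", "d0", "d1", "d1"]]
pairs = split_stitched_frame(frame, views=2)
```

A real decoder would read the layout from the preset relationship carried with the image combination rather than hard-coding it.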

In a specific implementation, the parameter data of each image in the image combination can be obtained from the attribute information of the image. The parameter data may include extrinsic parameter data and may also include intrinsic parameter data. Extrinsic parameter data describes the spatial coordinates, pose, and similar properties of the capture device. Intrinsic parameter data describes properties of the capture device such as its optical center and focal length, and may also include distortion parameter data, which comprises radial distortion parameters and tangential distortion parameters. Radial distortion arises in the conversion from the capture device coordinate system to the physical image coordinate system. Tangential distortion arises during manufacture of the capture device, because the plane of the photosensitive element is not parallel to the lens. Information such as the shooting position and shooting angle of an image can be determined from the extrinsic parameter data. In virtual viewpoint image generation, using intrinsic parameter data that includes distortion parameter data makes the determined spatial mapping more accurate.

In a specific implementation, the virtual viewpoint path can be preset. For example, for a sports event such as a basketball or football match, an arc-shaped path can be planned in advance, so that whenever a highlight occurs, the corresponding virtual viewpoint images are generated along this arc-shaped path.

In a specific application, the virtual viewpoint path can be set based on a particular position or perspective in the venue (such as under the basket, the sideline, the referee's perspective, or the coach's perspective), or based on a particular object (such as a player on the court, an on-site host, a spectator, or an actor in film or television footage).

The path data corresponding to the virtual viewpoint path may include position data for a series of virtual viewpoints along the path.
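A preset arc-shaped path of this kind could, for example, be represented as a list of viewpoint positions. The Python sketch below is only an illustration: the function name `arc_viewpoint_path`, the horizontal arc, and the even angular spacing are assumptions, not details given in the description.

```python
import math

def arc_viewpoint_path(center, radius, height, start_deg, end_deg, count):
    """Return `count` viewpoint positions (x, y, z) evenly spaced on a horizontal arc."""
    if count < 2:
        raise ValueError("count must be at least 2")
    cx, cy = center
    path = []
    for i in range(count):
        # Interpolate the angle linearly from start_deg to end_deg.
        angle = math.radians(start_deg + (end_deg - start_deg) * i / (count - 1))
        path.append((cx + radius * math.cos(angle),
                     cy + radius * math.sin(angle),
                     height))
    return path

# Four viewpoints on a quarter circle of radius 10 at height 2.
path = arc_viewpoint_path(center=(0.0, 0.0), radius=10.0, height=2.0,
                          start_deg=0.0, end_deg=90.0, count=4)
```

Each entry of `path` plays the role of the position data of one virtual viewpoint; viewing angles could be stored alongside the positions in the same way.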

S292. According to the preset virtual viewpoint path data and the parameter data of the image combination, select from the image combination the corresponding groups of texture maps and depth maps for each virtual viewpoint in the virtual viewpoint path.

In a specific implementation, according to the position data of each virtual viewpoint in the virtual viewpoint path data and the parameter data of the image combination, the groups of texture maps and depth maps that satisfy a preset positional relationship and/or quantity relationship with each virtual viewpoint position can be selected from the image combination. For example, in a region of virtual viewpoint positions where the camera density is high, only the texture maps and corresponding depth maps captured by the two cameras closest to the virtual viewpoint may be selected, whereas in a region where the camera density is low, the texture maps and corresponding depth maps captured by the three or four cameras closest to the virtual viewpoint may be selected.

In an embodiment of the present invention, the texture maps and depth maps corresponding to the 2 to N capture devices closest to each virtual viewpoint position in the virtual viewpoint path can be selected, where N is the total number of capture devices in the capture array. For example, the texture maps and depth maps corresponding to the two capture devices closest to each virtual viewpoint position may be selected by default. In a specific implementation, the user can set the number of closest capture devices to select, up to the number of capture devices corresponding to the image combination.

With this approach, there is no special requirement on the spatial distribution of the capture devices in the capture array (for example, they may be arranged in a line, in an arc, or in any irregular layout). The actual distribution of the capture devices is determined from the acquired virtual viewpoint position data and the parameter data corresponding to the image combination, and an adaptive strategy is then used to select the corresponding groups of texture maps and depth maps from the image combination. This provides a high degree of freedom and flexibility in selection while reducing the amount of computation and preserving the quality of the generated virtual viewpoint images. It also lowers the installation requirements for the capture devices in the array, making it easier to adapt to different venues and simplifying installation.

In an embodiment of the present invention, according to the virtual viewpoint position data and the parameter data of the image combination, a preset number of corresponding groups of texture maps and depth maps closest to the virtual viewpoint position are selected from the image combination.
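The nearest-device selection can be sketched as follows. This is a minimal Python illustration that assumes each capture device's position has already been derived from its extrinsic parameter data and that closeness is measured by Euclidean distance; the name `select_nearest_cameras` is hypothetical.

```python
import math

def select_nearest_cameras(viewpoint, camera_positions, count=2):
    """Return the indices of the `count` cameras closest to `viewpoint`, nearest first."""
    # Never request more cameras than exist, and always select at least one.
    count = max(1, min(count, len(camera_positions)))
    order = sorted(range(len(camera_positions)),
                   key=lambda i: math.dist(viewpoint, camera_positions[i]))
    return order[:count]

# Four cameras on a line; the virtual viewpoint sits between cameras 1 and 2.
cameras = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
selected = select_nearest_cameras((1.2, 0.0, 0.5), cameras, count=2)
```

The returned indices identify which groups of texture maps and depth maps to pass on to rendering. An adaptive strategy could vary `count` with the local camera density.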

It can be understood that, in a specific implementation, other preset rules can also be used to select the corresponding groups of texture maps and depth maps from the image combination. For example, the selection can be based on the processing capability of the virtual viewpoint image generation device, or on the user's requirements for generation speed and for the definition of the generated image (such as standard definition, high definition, or ultra-high definition).

S293. Input the corresponding groups of texture maps and depth maps for each virtual viewpoint into a graphics processing unit. For each virtual viewpoint in the virtual viewpoint path, with the pixel as the processing unit, multiple threads perform combined rendering on the pixels of the selected groups of texture maps and depth maps in the image combination to obtain the image corresponding to that virtual viewpoint.

A graphics processing unit (Graphics Processing Unit, GPU), also called a display core, visual processor, or display chip, is a microprocessor dedicated to image and graphics computation. It can be installed in electronic devices that need image-related computation, such as personal computers, workstations, video game consoles, and some mobile terminals (such as tablets and smartphones).

To help those skilled in the art better understand and implement the embodiments of the present invention, a GPU architecture used in some embodiments is briefly introduced below. It should be noted that this GPU architecture is only a specific example and does not limit the GPUs to which the embodiments of the present invention apply.

In some embodiments of the present invention, the GPU may use the Compute Unified Device Architecture (CUDA) parallel programming architecture to perform combined rendering on the pixels of the selected groups of texture maps and depth maps in the image combination. CUDA is a new hardware and software architecture for issuing and managing computations on the GPU as a data-parallel computing device, without needing to map them to a graphics application programming interface (Application Programming Interface, API).

When programmed through CUDA, the GPU can be viewed as a computing device capable of executing a very large number of threads in parallel. It operates as a coprocessor to the main CPU, or host. In other words, the data-parallel, compute-intensive portions of an application running on the host are offloaded to the GPU.

More precisely, a portion of an application that is executed many times, independently on different data, can be isolated into a function that is executed on the GPU device as many different threads. To that end, such a function is compiled to the instruction set of the GPU device, and the resulting program, called a kernel (Kernel), is downloaded to the GPU. The batch of threads that executes a kernel is organized into thread blocks (Thread Block).

A thread block is a batch of threads that can cooperate by efficiently sharing data through fast shared memory and by synchronizing their execution to coordinate memory accesses. In a specific implementation, synchronization points can be specified in the kernel, at which the threads in a thread block are suspended until all of them have reached the synchronization point.

In a specific implementation, the maximum number of threads that a thread block can contain is limited. However, blocks of the same dimensions and size that execute the same kernel can be batched together into a grid of thread blocks (Grid of Thread Blocks), so the total number of threads that can be launched in a single kernel invocation is much larger.

As can be seen from the above, with the CUDA architecture a large number of threads can process data in parallel on the GPU at the same time, which can greatly increase the speed of virtual viewpoint image generation.
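To make the block and grid organization concrete, the following Python sketch computes how many thread blocks are needed so that one thread is available per pixel. The 16x16 block size and the name `launch_dimensions` are illustrative assumptions; the arithmetic is the usual ceiling division used when sizing a kernel launch.

```python
def launch_dimensions(width, height, block=(16, 16)):
    """Return (grid, block) such that grid * block covers every pixel of the image."""
    grid_x = (width + block[0] - 1) // block[0]    # ceiling division
    grid_y = (height + block[1] - 1) // block[1]
    return (grid_x, grid_y), block

# One thread per pixel of a 1920x1080 depth map or texture map.
grid, block = launch_dimensions(1920, 1080)
total_threads = grid[0] * block[0] * grid[1] * block[1]
```

Because 1080 is not a multiple of 16, the grid launches slightly more threads than there are pixels, so each thread would need to check that its coordinates lie inside the image before doing any work.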

To help those skilled in the art better understand and implement the method, the pixel-by-pixel processing performed in each step of combined rendering is described in detail below.

In a specific implementation, referring to the flowchart of the GPU combined rendering method shown in FIG. 30, step S293 can be implemented through the following steps:

S2931. Forward-map the corresponding groups of depth maps in parallel onto the virtual viewpoint.

Forward mapping of a depth map maps the depth map of an original camera (capture device) to the position of the virtual camera through a transformation of coordinate space positions, yielding a depth map at the virtual camera position. Specifically, forward mapping maps each pixel of the original camera's depth map to the virtual viewpoint according to a preset coordinate mapping relationship.

In a specific implementation, a first kernel function can be run on the GPU to forward-map the pixels of the corresponding groups of depth maps, in parallel, to the corresponding virtual viewpoint position.

During research and practice, the inventor found that forward mapping may suffer from foreground-background occlusion and from a mapping gap effect, both of which degrade the quality of the generated image. First, to handle foreground-background occlusion, in an embodiment of the present invention, when multiple depth values map to the same pixel of the virtual viewpoint, an atomic operation can be used to keep the largest pixel value, yielding a first depth map at the corresponding virtual viewpoint position. Then, to reduce the impact of the mapping gap effect, a second depth map at the virtual viewpoint position can be created based on the first depth map: each pixel of the second depth map is processed in parallel and takes the maximum value of the pixels in a preset region around the corresponding pixel position in the first depth map.

Because every pixel can be processed in parallel during forward mapping, the processing can be greatly accelerated and the time efficiency of forward mapping improved.

S2932. Post-process the forward-mapped depth map in parallel.

After forward mapping is complete, the virtual viewpoint depth map can be post-processed. Specifically, a preset second kernel function can be run on the GPU to apply, for each pixel of the second depth map obtained by forward mapping, median filtering over a preset region around that pixel position. Because median filtering can be applied to all pixels of the second depth map in parallel, post-processing can be greatly accelerated and its time efficiency improved.

S2933. Backward-map the corresponding groups of texture maps in parallel.

In this step, for each pixel at the virtual viewpoint position, its coordinates in the original camera's texture map are computed from the depth map value, and the corresponding value is obtained by sub-pixel interpolation. On a GPU, sub-pixel values can be fetched directly with bilinear interpolation, so this step only needs to sample the original camera's texture at the coordinates computed for each pixel. In a specific implementation, a preset third kernel function can be run on the GPU to interpolate the pixels of the selected groups of texture maps in parallel, generating the corresponding virtual texture maps.

Running the third kernel function on the GPU to interpolate the pixels of the selected groups of texture maps in parallel and generate the corresponding virtual texture maps can greatly accelerate backward mapping and improve its time efficiency.

S2934. Fuse, in parallel, the pixels of the virtual texture maps generated by backward mapping.

In a specific implementation, a fourth kernel function can be run on the GPU to perform weighted fusion, in parallel, of the pixels at the same position in the virtual texture maps generated by backward mapping.

Running the fourth kernel function on the GPU to perform weighted fusion, in parallel, of the pixels at the same position in the virtual texture maps generated by backward mapping can greatly accelerate the fusion of the virtual texture maps and improve the time efficiency of image fusion.

A specific example is described in detail below.

In step S2931, for the forward mapping of the depth map, the projection mapping of each pixel can first be computed by the first kernel function on the GPU.

Consider a pixel (u, v) in the image of a real camera. First, using the perspective projection model of that camera, the image coordinates (u, v) are converted to the coordinates [X, Y, Z]^T in the camera coordinate system. It can be understood that different perspective projection models of different cameras call for different conversion methods.

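For example, for the perspective projection model:

$$
Z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} =
\begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \tag{1}
$$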

where [u, v, 1]^T is the homogeneous coordinate of pixel (u, v), [X, Y, Z]^T is the coordinate, in the camera coordinate system, of the real object corresponding to (u, v), f_x and f_y are the focal lengths in the x and y directions, and c_x and c_y are the optical center coordinates in the x and y directions.

Therefore, for a pixel (u, v) in the image, given the depth value Z of the pixel and the physical parameters of the corresponding camera lens (f_x, f_y, c_x, and c_y, which can be obtained from the parameter data of the image combination described above), the coordinates [X, Y, Z]^T of the corresponding point in the camera coordinate system can be obtained from formula (1) above.
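The back-projection can be sketched in a few lines of Python. The sketch assumes that formula (1) is the usual pinhole relation Z[u, v, 1]^T = K[X, Y, Z]^T with no lens distortion, so that X = (u - c_x) Z / f_x and Y = (v - c_y) Z / f_y; the function name `pixel_to_camera` and the numeric values are illustrative only.

```python
def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth Z into camera coordinates (X, Y, Z).

    Assumes an undistorted pinhole model: u = fx * X / Z + cx, v = fy * Y / Z + cy.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# A pixel 200 px right of and 200 px above the optical center, 5 units away.
point = pixel_to_camera(u=1160.0, v=340.0, depth=5.0,
                        fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

On the GPU, each thread of the first kernel function would evaluate this expression for one pixel of the source depth map.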

After the conversion from the image coordinate system to the camera coordinate system, the coordinates of the object in the current camera coordinate system can be transformed into the coordinate system of the camera at the virtual viewpoint by a coordinate transformation in three-dimensional space. Specifically, the following transformation can be used:

$$
\begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix} =
R_{12} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + T_{12} \tag{2}
$$

where R₁₂ is a 3x3 rotation matrix and T₁₂ is a translation vector.

Let the transformed three-dimensional coordinates be [X₁, Y₁, Z₁]^T. Applying the inverse of the image-to-camera conversion described above gives the correspondence between the transformed three-dimensional coordinates in the virtual camera and the image coordinates of the virtual camera. This establishes the point-to-point projection from the real viewpoint image to the virtual viewpoint image. By transforming every pixel of the real viewpoint and rounding the resulting coordinates to integers, the projected depth map in the virtual viewpoint image is obtained.
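The rigid transformation and the projection into the virtual camera can be sketched as follows. The Python code assumes the transformation has the form [X₁, Y₁, Z₁]^T = R₁₂[X, Y, Z]^T + T₁₂ and that the virtual camera uses an undistorted pinhole model; the function name `project_to_virtual` and the sample values are hypothetical.

```python
def project_to_virtual(point, rotation, translation, fx, fy, cx, cy):
    """Map a 3D point from the source camera frame to an integer pixel of the virtual camera.

    `rotation` is a 3x3 matrix (list of rows) and `translation` a 3-vector.
    Returns (u1, v1, z1), or None if the point lies behind the virtual camera.
    """
    x, y, z = point
    # Rigid transform into the virtual camera's coordinate system.
    x1, y1, z1 = [rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z
                  + translation[r] for r in range(3)]
    if z1 <= 0:
        return None
    # Pinhole projection followed by rounding to the nearest pixel.
    u1 = round(fx * x1 / z1 + cx)
    v1 = round(fy * y1 / z1 + cy)
    return (u1, v1, z1)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_to_virtual((1.0, -1.0, 5.0), identity, (-0.5, 0.0, 0.0),
                           fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

The returned depth `z1` is the value written into the projected depth map at pixel `(u1, v1)`.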

Once the point-to-point mapping between the original camera depth map and the virtual camera depth map has been established, several positions in the original camera's depth map may map to the same position in the virtual camera's depth map during projection, which gives rise to foreground-background occlusion in forward mapping. To address this, in an embodiment of the present invention, an atomic operation can be used to keep the smallest depth value as the final result at the mapped position, as shown in formula (3):

$$
\mathrm{Depth}(u,v) = \min\left[\mathrm{Depth}_{1 \ldots N}(u,v)\right] \tag{3}
$$

It should be noted that the smallest depth value corresponds to the largest depth map pixel value. Therefore, taking the largest pixel value on the mapped depth map yields the first depth map at the corresponding virtual viewpoint position.

In a specific implementation, the maximum or minimum over multiple points that map to the same position can be computed in the CUDA parallel environment by calling the atomic operation functions atomicMin or atomicMax provided by CUDA.
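The occlusion rule can be illustrated with a sequential Python stand-in. In this sketch, each projected sample `(u, v, z)` corresponds to the work of one GPU thread, and the `min` update plays the role that an atomic minimum would play when many threads write to the same pixel concurrently. The function name `resolve_occlusion` and the use of infinity to mark uncovered pixels are assumptions made for the example.

```python
def resolve_occlusion(projected, width, height):
    """Build the first virtual depth map by keeping the smallest depth per target pixel."""
    inf = float("inf")                 # marks pixels that no source pixel mapped to
    depth = [[inf] * width for _ in range(height)]
    for u, v, z in projected:          # on a GPU, one thread per projected sample
        if 0 <= u < width and 0 <= v < height:
            # Sequential stand-in for an atomic minimum on depth[v][u].
            depth[v][u] = min(depth[v][u], z)
    return depth

# Two samples land on pixel (1, 0); the nearer one (z = 3.0) must win.
samples = [(1, 0, 8.0), (1, 0, 3.0), (2, 1, 6.0), (5, 5, 1.0)]
first_depth = resolve_occlusion(samples, width=4, height=2)
```

If the depth map stores pixel values that grow as objects get nearer, the same logic applies with a maximum in place of the minimum.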

The process of obtaining the first depth map may produce a gap effect, in which some pixels are left uncovered because of limited mapping precision. To address this, an embodiment of the present invention can apply gap masking to the first depth map. In one embodiment, 3*3 gap masking is applied to the first depth map, as follows:

First, a second depth map at the virtual viewpoint position is created. Then, for each pixel D(x, y) of the second depth map, the existing pixels D_old(x, y) within the surrounding 3*3 window in the first depth map at the virtual viewpoint position are examined and their maximum is taken. This can be implemented by the following kernel function operation:

$$
D(x,y) = \max\left[D_{\mathrm{old}}(x,y)\right] \tag{4}
$$

It can be understood that the surrounding region used in gap masking may also take other sizes, such as 5*5. The size can be set empirically to obtain a better result.
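Gap masking with a configurable window can be sketched as below. The Python code assumes that the depth map stores pixel values (larger means nearer) and that an uncovered pixel holds 0, so taking the window maximum fills isolated gaps; the function name `mask_gaps` and the truncation of the window at the image border are illustrative choices.

```python
def mask_gaps(first_map, radius=1):
    """Second depth map: each pixel takes the maximum pixel value in its window.

    radius=1 gives a 3*3 window, radius=2 a 5*5 window.
    The window is truncated at the image border.
    """
    height, width = len(first_map), len(first_map[0])
    second = [[0] * width for _ in range(height)]
    for y in range(height):            # every (x, y) is independent: one GPU thread each
        for x in range(width):
            window = [first_map[yy][xx]
                      for yy in range(max(0, y - radius), min(height, y + radius + 1))
                      for xx in range(max(0, x - radius), min(width, x + radius + 1))]
            second[y][x] = max(window)
    return second

# Two uncovered pixels (value 0) are filled from their neighbours.
first_map = [[10, 10, 0, 10],
             [10, 0, 10, 10],
             [10, 10, 10, 10]]
second_map = mask_gaps(first_map)
```

Because each output pixel reads only the first depth map and writes only its own position in the second, no synchronization between threads is needed.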

For step S2932, in a specific implementation, 3*3 or 5*5 median filtering can be applied to the second depth map at the virtual viewpoint position. For example, for 3*3 median filtering, the second kernel function on the GPU can operate according to the following formula, where the median is taken over the 3*3 window around (x, y) in the second depth map:

$$
D(x,y) = \mathrm{Median}\left[D_{\mathrm{old}}(x,y)\right] \tag{5}
$$
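A per-pixel median filter of this kind might look like the following Python sketch. Border handling is not specified in the description; the sketch clamps coordinates to the image edge so that every window has an odd number of samples, which is an assumption, as is the name `median_filter`.

```python
import statistics

def median_filter(depth_map, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 window, clamping at borders."""
    height, width = len(depth_map), len(depth_map[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):            # independent per pixel: one GPU thread each
        for x in range(width):
            window = [depth_map[min(max(yy, 0), height - 1)][min(max(xx, 0), width - 1)]
                      for yy in range(y - radius, y + radius + 1)
                      for xx in range(x - radius, x + radius + 1)]
            out[y][x] = statistics.median(window)
    return out

# A single outlier in the middle of a flat region is removed.
noisy = [[10, 10, 10],
         [10, 200, 10],
         [10, 10, 10]]
filtered = median_filter(noisy)
```

Passing `radius=2` gives the 5*5 variant mentioned above.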

In step S2933, the third kernel function running on the GPU computes, from the depth map value, the coordinates in the original camera's texture map that correspond to each virtual viewpoint position. The third kernel function can be implemented as the inverse of the process in step S2931.
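Once the fractional source coordinates are known, the texture value is fetched by bilinear interpolation. A GPU texture unit can do this in hardware; the Python sketch below spells out the same arithmetic for a single-channel texture. The function name `bilinear_sample` and the clamping of coordinates to the image are illustrative assumptions.

```python
import math

def bilinear_sample(texture, x, y):
    """Sample a single-channel texture at fractional (x, y) using bilinear interpolation."""
    height, width = len(texture), len(texture[0])
    x = min(max(x, 0.0), width - 1.0)      # clamp coordinates to the image
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    ax, ay = x - x0, y - y0                # fractional offsets inside the pixel cell
    top = texture[y0][x0] * (1 - ax) + texture[y0][x1] * ax
    bottom = texture[y1][x0] * (1 - ax) + texture[y1][x1] * ax
    return top * (1 - ay) + bottom * ay

texture = [[0.0, 100.0],
           [100.0, 200.0]]
value = bilinear_sample(texture, 0.5, 0.5)   # centre of the four texels
```

In the backward mapping step, `(x, y)` would be the coordinates computed for one virtual-view pixel, and each GPU thread would perform one such fetch per selected source camera.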

在步骤S2934中，对于虚拟视点位置(x,y)的像素点f(x,y)，可以将所有原始相机映射得到的纹理图的相应位置的像素值根据置信度conf(x,y)进行加权。所述第四核心函数可以采用如下公式进行计算：In step S2934, for the pixel point f(x, y) of the virtual viewpoint position (x, y), the pixel values of the corresponding positions of the texture map obtained by all original camera mappings can be calculated according to the confidence degree conf(x, y). weighted. The fourth core function can be calculated using the following formula:

f(x,y) = ∑ conf(x,y) * f(x,y) (6)
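The confidence-weighted fusion of formula (6) can be sketched in Python as below. This is only an illustration: the name `fuse_textures`, the single-channel list-of-lists layout, and in particular the division by the sum of confidences are assumptions of this example. Formula (6) itself does not show a normalization step, so the sketch behaves identically to the formula only when the confidences at each pixel already sum to 1.

```python
def fuse_textures(textures, confidences):
    """Fuse per-camera virtual texture maps into one map.

    textures:    list of H x W maps, one per original camera (hypothetical layout).
    confidences: list of H x W confidence maps, aligned with `textures`.
    Pixels whose confidences sum to zero are left at 0.0, i.e. as holes
    to be filled by a later step (an assumption of this sketch).
    """
    height = len(textures[0])
    width = len(textures[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = sum(conf[y][x] for conf in confidences)
            if total == 0:
                continue
            weighted = sum(
                conf[y][x] * tex[y][x]
                for tex, conf in zip(textures, confidences)
            )
            # Normalization is an added assumption, not part of formula (6).
            fused[y][x] = weighted / total
    return fused


camera_textures = [[[10.0, 20.0]], [[30.0, 40.0]]]
camera_confidences = [[[1.0, 0.0]], [[3.0, 0.0]]]
fused_map = fuse_textures(camera_textures, camera_confidences)
```

Here the first pixel becomes (1*10 + 3*30) / 4 = 25.0, while the second pixel has no confident contribution and stays 0.0.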

Through the above steps S2931 to S2934, the virtual viewpoint image can be obtained. In a specific implementation, the virtual texture map obtained after weighted fusion may be further processed and optimized. For example, hole filling may be performed in parallel on the pixels of the weighted-fused texture map to obtain the image corresponding to the virtual viewpoint.

For hole filling of the virtual texture map, in a specific implementation, a separate window may be opened for each pixel so that the pixels can be processed in parallel. For example, for each hole pixel, a window of size N*M may be opened, and the value of the hole pixel is then computed as a weighted combination of the non-hole pixel values within the window. With the above method, the virtual viewpoint image can be generated entirely by parallel computation on the GPU, which greatly accelerates the generation process.

FIG. 31 is a schematic diagram of the hole filling method. The generated virtual viewpoint view G contains a hole area F, and rectangular windows a and b are opened for pixel f1 and pixel f2 in the hole area F, respectively. Then, for pixel f1, all the existing non-hole pixels in its rectangular window are obtained (alternatively, a subset of them may be obtained by down-sampling), and the value of pixel f1 in the hole area F is obtained by distance weighting (or average weighting). Likewise, applying the same operation to pixel f2 yields the value of pixel f2. In a specific implementation, a fifth kernel function may be run on the GPU to parallelize the processing and shorten the hole filling time.

The fifth kernel function may be calculated using the following formula:

P(x,y) = Average[Window(x,y)] (7)

where P(x, y) is the value of a point in the hole, Window(x, y) is the values (or down-sampled values) of all existing non-hole pixels in the window opened for that point, and Average is the average (or weighted average) of these pixels.
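The windowed hole filling of formula (7) can be sketched in Python as follows, using plain (unweighted) averaging. The name `fill_holes`, the boolean `hole_mask`, the `half_h` and `half_w` parameters (giving a window of (2*half_h+1) by (2*half_w+1) pixels), and the decision to leave a hole untouched when its window contains no non-hole pixel are all assumptions of this example.

```python
def fill_holes(image, hole_mask, half_h=1, half_w=1):
    """Fill each hole pixel with the mean of the non-hole pixels in its window.

    image:     H x W single-channel map (hypothetical layout).
    hole_mask: H x W booleans, True where the pixel is a hole.
    Values are always read from the input image, never from already
    filled pixels, so every hole pixel can be processed independently.
    """
    height = len(image)
    width = len(image[0])
    filled = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if not hole_mask[y][x]:
                continue
            values = [
                image[ny][nx]
                for ny in range(max(0, y - half_h), min(height, y + half_h + 1))
                for nx in range(max(0, x - half_w), min(width, x + half_w + 1))
                if not hole_mask[ny][nx]
            ]
            if values:
                filled[y][x] = sum(values) / len(values)
    return filled


view = [
    [1.0, 2.0, 3.0],
    [4.0, 0.0, 6.0],
    [7.0, 8.0, 9.0],
]
holes = [
    [False, False, False],
    [False, True, False],
    [False, False, False],
]
filled_view = fill_holes(view, holes)
```

The centre pixel is replaced by the mean of its eight neighbours, 40 / 8 = 5.0. A distance-weighted variant would replace the plain mean with weights that decrease with distance from the hole pixel.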

In the embodiments of the present invention, in addition to generating the virtual viewpoint image of each virtual viewpoint position in parallel on a per-pixel basis, the texture maps and depth maps of the corresponding groups of the virtual viewpoints in the virtual viewpoint path may be input to multiple GPUs respectively, so that multiple virtual viewpoint images are generated in parallel and the virtual viewpoint path images are generated more efficiently.

In a specific implementation, to further improve processing efficiency, each of the above steps may be executed by a different block grid.

Referring to the schematic structural diagram of the virtual viewpoint image generation system shown in FIG. 32, in an embodiment of the present invention, the virtual viewpoint image generation system 320 may include a CPU 321 and a GPU 322, wherein:

the CPU 321 is adapted to acquire a multi-angle free-view image combination, parameter data of the image combination, and preset virtual viewpoint path data, wherein the image combination includes multiple angle-synchronized groups of texture maps and depth maps having a correspondence; and to select, from the image combination, the texture maps and depth maps of the corresponding groups for each virtual viewpoint in the virtual viewpoint path according to the preset virtual viewpoint path data and the parameter data of the image combination;

the GPU 322 is adapted to, for each virtual viewpoint in the virtual viewpoint path, call the corresponding kernel functions and perform combined rendering in parallel on the pixels in the texture maps and depth maps of the corresponding groups in the selected image combination, to obtain the image corresponding to the virtual viewpoint.

Specifically, the GPU 322 is adapted to: forward-map the depth maps of the corresponding groups in parallel onto the virtual viewpoint; post-process the forward-mapped depth maps in parallel; reverse-map the texture maps of the corresponding groups in parallel; and fuse, in parallel, the pixels of the virtual texture maps generated by the reverse mapping.

The GPU 322 may generate the virtual viewpoint image of each virtual viewpoint by using steps S2931 to S2934 of the foregoing virtual viewpoint image generation method together with the hole filling step. For details, refer to the description of the foregoing embodiments, which is not repeated here.

In a specific implementation, there may be one GPU or multiple GPUs, as shown in FIG. 32.

In a specific application, the GPU may be a standalone GPU chip, a GPU core within a GPU chip, or a GPU server; it may also be a GPU chip formed by packaging multiple GPU chips or multiple GPU cores, or a GPU cluster composed of multiple GPU servers.

Correspondingly, the texture maps and depth maps of the corresponding groups of the virtual viewpoints in the virtual viewpoint path may be input to multiple GPU chips, multiple GPU cores, or multiple GPU servers respectively, to generate multiple virtual viewpoint images in parallel. For example, suppose the virtual viewpoint path data corresponding to a virtual viewpoint path contains 20 virtual viewpoint position coordinates in total. The data corresponding to these 20 coordinates may be input to multiple GPU chips in parallel; with 10 GPU chips in total, for example, the data may be processed in parallel in two batches. Each GPU chip can in turn generate the virtual viewpoint image of its virtual viewpoint position in parallel on a per-pixel basis, which greatly speeds up virtual viewpoint image generation and improves its timeliness.
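The batching in the example above (20 virtual viewpoints, 10 GPU chips, two batches) can be sketched in Python as below. This is only a scheduling illustration under assumptions: the name `schedule_batches`, the integer GPU indices, and the simple in-order assignment are hypothetical, and the sketch does not launch any GPU work.

```python
def schedule_batches(viewpoints, num_gpus):
    """Split the viewpoints of a path into batches of at most num_gpus items.

    Each batch is a list of (gpu_index, viewpoint) pairs; all pairs in one
    batch are meant to be processed at the same time, one per GPU.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    batches = []
    for start in range(0, len(viewpoints), num_gpus):
        chunk = viewpoints[start:start + num_gpus]
        batches.append(list(enumerate(chunk)))
    return batches


# 20 hypothetical viewpoint identifiers distributed over 10 GPUs.
path_viewpoints = list(range(20))
batches = schedule_batches(path_viewpoints, num_gpus=10)
```

With these inputs `batches` has two entries of ten pairs each, and the second batch starts by assigning viewpoint 10 to GPU 0.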

An embodiment of the present invention further provides an electronic device. Referring to the schematic structural diagram of the electronic device shown in FIG. 33, the electronic device 330 may include a memory 331, a CPU 332, and a GPU 333, wherein the memory 331 stores computer instructions executable on the CPU 332 and the GPU 333, and the CPU 332 and the GPU 333, when cooperatively executing the computer instructions, are adapted to perform the steps of the virtual viewpoint image generation method according to any of the foregoing embodiments of the present invention. For details, refer to the detailed description of the foregoing embodiments, which is not repeated here.

In a specific implementation, the electronic device may be a single server or a server cluster composed of multiple servers.

All of the above embodiments are applicable to live broadcast scenarios, and two or more embodiments may be combined as needed in an application. Those skilled in the art will understand that the above embodiments are not limited to live broadcast scenarios: the solutions in the embodiments of the present invention for video or image acquisition, data processing of video data streams, and image generation by the server are also applicable to the playback needs of non-live scenarios, such as recorded broadcast, rebroadcast, and other scenarios with low-latency requirements.

For the specific implementations, working principles, functions, and effects of the devices or systems in the embodiments of the present invention, refer to the description in the corresponding method embodiments.

An embodiment of the present invention further provides a computer-readable storage medium storing computer instructions which, when run, perform the steps of the method of any of the above embodiments of the present invention.

The computer-readable storage medium may be any suitable readable storage medium, such as an optical disc, a mechanical hard disk, or a solid-state drive. For the method performed by the instructions stored on the computer-readable storage medium, refer to the above method embodiments; details are not repeated here.

An embodiment of the present invention further provides a server including a memory and a processor, wherein the memory stores computer instructions executable on the processor, and the processor, when running the computer instructions, performs the steps of the method of any of the above embodiments of the present invention. For the specific implementation of the method performed when the computer instructions run, refer to the steps of the methods in the above embodiments; details are not repeated here.

Although the present invention is disclosed above, it is not limited thereto. Any person skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention; therefore, the protection scope of the present invention shall be that defined by the claims.

Claims

1. A method of data interaction, comprising:

acquiring, by an interactive terminal, a data stream to be played from a play control device in real time, and playing and displaying the data stream in real time, wherein the data stream to be played comprises video data and interaction identifiers, and each interaction identifier is associated with a designated frame time of the data stream to be played;

acquiring, by the interactive terminal in response to a triggering operation on an interaction identifier, interaction data corresponding to the designated frame time of the interaction identifier, wherein the interaction data comprises multi-angle free-view data, the multi-angle free-view data is generated by the interactive terminal or a server based on a plurality of received frame images corresponding to the designated frame time, the plurality of frame images are obtained by a data processing device intercepting, at the designated frame time, multiple video data streams synchronously acquired by a plurality of acquisition devices in an acquisition array, the data processing device provides the plurality of frame images to the interactive terminal or the server, and the data processing device is arranged in an on-site non-acquisition area;

performing, by the interactive terminal, image display of the multi-angle free view at the designated frame time based on the interaction data;

wherein the data processing device intercepting, at the designated frame time, the multiple video data streams synchronously acquired by the plurality of acquisition devices in the acquisition array comprises:

intercepting, by the data processing device, frame-level-synchronized video frames at the designated frame time in the multiple video data streams based on a received video frame interception instruction.

2. The data interaction method according to claim 1, wherein the multi-angle free-view data comprises pixel data, depth data, and parameter data of the plurality of frame images, wherein an association exists between the pixel data and the depth data of each frame image.

3. The data interaction method according to claim 1, wherein the plurality of acquisition devices in the acquisition array are placed at different positions in an on-site acquisition area according to a preset multi-angle free-view range.

4. The data interaction method according to claim 1, wherein the interaction identifier is generated by the play control device, and the play control device generates the interaction identifier associated with the video frame at the corresponding time in the data stream to be played based on the frame time information of the intercepted video frame from the data processing device.

5. The data interaction method according to claim 1, wherein the interaction data further comprises at least one of: on-site analysis data, information data of an acquisition object, information data of equipment associated with the acquisition object, information data of an object deployed on site, and information data of a logo displayed on site.

6. The data interaction method of claim 1, further comprising:

when an interaction end signal is detected, switching to the data stream to be played acquired from the play control device in real time, and playing and displaying it in real time.

7. The data interaction method according to claim 6, wherein the switching, when the interaction end signal is detected, to the data stream to be played acquired from the play control device in real time and playing and displaying it in real time comprises at least one of:

when an interaction end operation instruction is received, switching to the data stream to be played acquired from the play control device in real time, and playing and displaying it in real time;

when it is detected that the image display of the multi-angle free view at the designated frame time has reached the last image, switching to the data stream to be played acquired from the play control device in real time, and playing and displaying it in real time.

8. The data interaction method according to any one of claims 1 to 7, wherein the performing image display of the multi-angle free view based on the interaction data comprises:

determining a virtual viewpoint according to an interactive operation, wherein the virtual viewpoint is selected from a multi-angle free-view range, and the multi-angle free-view range is a range within which the virtual viewpoint can be switched for viewing a region to be viewed;

and displaying an image of the region to be viewed as seen from the virtual viewpoint, wherein the image is generated based on the interaction data and the virtual viewpoint.

9. A data processing system, comprising: an acquisition array, a data processing device, a server, a play control device, and an interactive terminal; wherein:

the acquisition array comprises a plurality of acquisition devices which are arranged at different positions in an on-site acquisition area according to a preset multi-angle free-view range, and which are adapted to synchronously acquire multiple video data streams in real time and upload the video data streams to the data processing device in real time;

the data processing device is arranged in an on-site non-acquisition area and is adapted to: intercept the multiple video data streams at a designated frame time according to a received video frame interception instruction, to obtain a plurality of frame images corresponding to the designated frame time and frame time information corresponding to the designated frame time; upload the plurality of frame images and the frame time information corresponding to the designated frame time to the server; and send the frame time information of the designated frame time to the play control device;

the server is adapted to receive the plurality of frame images and the frame time information uploaded by the data processing device, and to generate interaction data for interaction based on the plurality of frame images, wherein the interaction data comprises multi-angle free-view data, the interaction data is associated with the frame time information, and the multi-angle free-view data is generated based on the received plurality of frame images corresponding to the designated frame time;

the play control device is adapted to determine, in a data stream to be played, the designated frame time corresponding to the frame time information uploaded by the data processing device, generate an interaction identifier associated with the designated frame time, and transmit the data stream to be played containing the interaction identifier to the interactive terminal;

the interactive terminal is adapted to play and display, in real time, video containing the interaction identifier based on the received data stream to be played, and to acquire, based on a triggering operation on the interaction identifier, the interaction data that is stored in the server and corresponds to the designated frame time, so as to display multi-angle free-view images.

10. The data processing system according to claim 9, wherein the server is adapted to generate the multi-angle free-view data based on the received plurality of frame images corresponding to the designated frame time, the multi-angle free-view data comprising pixel data, depth data, and parameter data of the plurality of frame images, wherein an association exists between the pixel data and the depth data of each frame image.

11. The data processing system according to claim 9, wherein the plurality of acquisition devices in the acquisition array are arranged at different positions in the on-site acquisition area according to the preset multi-angle free-view range, and the server is arranged in the on-site non-acquisition area, in the cloud, or at the terminal.

12. The data processing system according to claim 9, wherein the play control device is adapted to generate the interaction identifier associated with the video frame at the corresponding time in the data stream to be played based on the frame time information of the video frame intercepted by the data processing device.

13. The data processing system according to claim 9, wherein the interactive terminal is further adapted to, when an interaction end signal is detected, switch to the data stream to be played acquired in real time from the play control device and play and display it in real time.

14. A data processing system, comprising: an acquisition array, a data processing device, a play control device, and an interactive terminal; wherein:

the data processing device is arranged in an on-site non-acquisition area and is adapted to intercept the multiple video data streams at a designated frame time according to a received video frame interception instruction, to obtain a plurality of frame images corresponding to the designated frame time and frame time information corresponding to the designated frame time, and to send the frame time information of the designated frame time to the play control device;

the interactive terminal is adapted to play and display, in real time, video containing the interaction identifier based on the received data stream to be played; to acquire from the data processing device, based on a triggering operation on the interaction identifier, the plurality of frame images corresponding to the designated frame time of the interaction identifier; to generate interaction data for interaction based on the plurality of frame images; and to display multi-angle free-view images, wherein the interaction data comprises multi-angle free-view data, and the multi-angle free-view data is generated based on the received plurality of frame images corresponding to the designated frame time.

15. An interactive terminal, comprising:

a data stream acquisition unit adapted to acquire a data stream to be played from a play control device in real time, wherein the data stream to be played comprises video data and an interaction identifier, and the interaction identifier is associated with a designated frame time of the data stream to be played;

a play display unit adapted to play and display the video and the interaction identifier of the data stream to be played in real time;

an interaction data acquisition unit adapted to acquire, in response to a triggering operation on the interaction identifier, interaction data corresponding to the designated frame time, wherein the interaction data comprises multi-angle free-view data, the multi-angle free-view data is generated based on a plurality of received frame images corresponding to the designated frame time, the plurality of frame images are obtained by a data processing device intercepting, at the designated frame time, multiple video data streams synchronously acquired by a plurality of acquisition devices in an acquisition array, the data processing device is arranged in an on-site non-acquisition area, and the data processing device intercepting, at the designated frame time, the multiple video data streams synchronously acquired by the plurality of acquisition devices in the acquisition array comprises: the data processing device intercepting frame-level-synchronized video frames at the designated frame time in the multiple video data streams based on a received video frame interception instruction;

and an interaction display unit adapted to display images of the multi-angle free view at the designated frame time based on the interaction data.

16. The interactive terminal according to claim 15, further comprising: a switching unit adapted to, when an interaction end signal is detected, trigger switching to the data stream to be played acquired in real time by the data stream acquisition unit from the play control device, for real-time play and display by the play display unit.

17. An interactive terminal, comprising: a processor, a network component, a memory and a display component; wherein:

the processor is adapted to acquire a data stream to be played in real time through the network component, and to acquire, in response to a triggering operation on an interaction identifier, interaction data corresponding to a designated frame time of the interaction identifier, wherein the data stream to be played comprises video data and the interaction identifier, the interaction identifier is associated with the designated frame time of the data stream to be played, the interaction data comprises multi-angle free-view data, the multi-angle free-view data is generated based on a plurality of received frame images corresponding to the designated frame time, the plurality of frame images are obtained by a data processing device intercepting, at the designated frame time, multiple video data streams synchronously acquired by a plurality of acquisition devices in an acquisition array, and the data processing device intercepting, at the designated frame time, the multiple video data streams synchronously acquired by the plurality of acquisition devices in the acquisition array comprises:

the data processing device, which is arranged in an on-site non-acquisition area, intercepting frame-level-synchronized video frames at the designated frame time in the multiple video data streams based on a received video frame interception instruction;

the memory is adapted to store the data stream to be played acquired in real time;

the display component is adapted to display the video and the interaction identifier of the data stream to be played in real time based on the data stream to be played acquired in real time, and to display images of the multi-angle free view at the designated frame time based on the interaction data.

18. An interactive terminal comprising a memory and a processor, said memory having stored thereon computer instructions executable on said processor, characterized in that said processor executes the steps of the data interaction method according to any of claims 1 to 8 when said computer instructions are executed.

19. A computer readable storage medium having stored thereon computer instructions, which when run perform the steps of the data interaction method of any of claims 1 to 8.