
CN111754564B - Video display method, device, equipment and storage medium - Google Patents

Video display method, device, equipment and storage medium

Info

Publication number
CN111754564B
Authority
CN
China
Prior art keywords
reference object
video
position information
data
target point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910245057.0A
Other languages
Chinese (zh)
Other versions
CN111754564A (en)
Inventor
金海善
李勇
王威
裴建军
王欢
邹辉
李睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision System Technology Co Ltd
Original Assignee
Hangzhou Hikvision System Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision System Technology Co Ltd filed Critical Hangzhou Hikvision System Technology Co Ltd
Priority to CN201910245057.0A priority Critical patent/CN111754564B/en
Publication of CN111754564A publication Critical patent/CN111754564A/en
Application granted granted Critical
Publication of CN111754564B publication Critical patent/CN111754564B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application discloses a video display method, apparatus, device, and storage medium. The method comprises the following steps: acquiring a video image, the video image being obtained from images captured by one or more cameras; determining a plurality of reference objects in the video image; acquiring related data that includes the reference objects and is collected by different data acquisition modes; determining physical position information of the reference objects according to the related data; determining physical position information of a target point in the video image according to the physical position information of the reference objects and the position information of the reference objects in the video image; and displaying basic data of the target point in association, based on the physical position information of the target point in the video image. The method and apparatus can link up different types of data and applications, achieve fusion of multiple data acquisition modes, enrich video display modes, and improve user experience.

Description

Video display method, device, equipment and storage medium
Technical Field
The embodiments of the present application relate to the field of data processing technologies, and in particular to a video display method, apparatus, device, and storage medium.
Background
With the continuous development of video applications, video acquisition modes keep multiplying, as do the types of video collected.
Given this diversity of acquisition modes and video types, how to display the collected video has become a key problem for video applications.
Disclosure of Invention
The embodiments of the present application provide a video display method, apparatus, device, and storage medium, which can solve the above problems in the related art. The technical solutions are as follows:
in one aspect, an embodiment of the present application provides a video display method, the method including:
acquiring a video image, the video image being obtained from images captured by one or more cameras;
determining reference objects in the video image, the reference objects being plural in number;
acquiring related data that includes the reference objects and is collected by different data acquisition modes;
determining physical position information of the reference objects according to the related data;
determining physical position information of a target point in the video image according to the physical position information of the reference objects and the position information of the reference objects in the video image;
and displaying basic data of the target point in association, based on the physical position information of the target point in the video image.
In a possible implementation of the present application, the acquiring of related data that includes the reference object and is collected by different data acquisition modes includes:
determining a target data acquisition mode according to the application scenario of the video display and the type of the reference object;
and acquiring the related data that includes the reference object and is collected in the target data acquisition mode.
In a possible implementation of the present application, the determining of the physical position information of the reference object according to the related data includes:
when the target data acquisition mode is a first data acquisition mode, the related data includes a three-dimensional video obtained in the first data acquisition mode;
fusing the video image with the three-dimensional video to obtain a fused video to be processed;
determining position information of the reference object in the video to be processed;
and determining the physical position information of the reference object from the related data according to the position information of the reference object in the video to be processed.
In a possible implementation of the present application, the determining of the physical position information of the reference object according to the related data includes:
determining, by matching, the physical position information of the reference object from the related data according to the attribute information of the reference object.
In a possible implementation of the present application, the displaying of the basic data of the target point in association, based on the physical position information of the target point in the video image, includes:
displaying the video image in a first display area, and displaying target point information in the video image;
and displaying the basic data of the target point in a second display area.
In a possible implementation of the present application, the displaying of the basic data of the target point in the second display area includes:
if there are multiple pieces of basic data for the target point, superimposing the multiple pieces of basic data in space and time, and displaying the superimposed basic data of the target point in the second display area.
In a possible implementation of the present application, the displaying of the basic data of the target point in the second display area includes:
if there are multiple pieces of basic data for the target point, determining the hierarchical relationship among the multiple pieces of basic data, and displaying the basic data of the target point level by level in the second display area according to that hierarchical relationship.
In a possible implementation of the present application, the determining of the physical position information of the target point in the video image according to the physical position information of the reference object and the position information of the reference object in the video image includes:
establishing a position mapping model according to the physical position information of the reference object and the position information of the reference object in the video image;
and determining the physical position information of the target point in the video image based on the position mapping model.
In one aspect, a video display apparatus is provided, the apparatus including:
a first acquisition module, configured to acquire a video image, the video image being obtained from images captured by one or more cameras;
a first determining module, configured to determine a plurality of reference objects in the video image;
a second acquisition module, configured to acquire related data that includes the reference object and is collected by different data acquisition modes;
a second determining module, configured to determine physical position information of the reference object according to the related data;
a third determining module, configured to determine physical position information of a target point in the video image according to the physical position information of the reference object and the position information of the reference object in the video image;
and a display module, configured to display basic data of the target point in association, based on the physical position information of the target point in the video image.
In a possible implementation of the present application, the second acquisition module is configured to determine a target data acquisition mode according to the application scenario of the video display and the type of the reference object, and to acquire the related data that includes the reference object and is collected in the target data acquisition mode.
In a possible implementation of the present application, when the target data acquisition mode is a first data acquisition mode, the related data includes a three-dimensional video obtained in the first data acquisition mode, and the second acquisition module is configured to: fuse the video image with the three-dimensional video to obtain a fused video to be processed; determine position information of the reference object in the video to be processed; and determine the physical position information of the reference object from the related data according to the position information of the reference object in the video to be processed.
In a possible implementation of the present application, the second acquisition module is configured to determine, by matching, the physical position information of the reference object from the related data according to the attribute information of the reference object.
In a possible implementation of the present application, the display module is configured to display the video image in a first display area, display target point information in the video image, and display the basic data of the target point in a second display area.
In a possible implementation of the present application, the display module is configured to, if there are multiple pieces of basic data for the target point, superimpose the multiple pieces of basic data in space and time and display the superimposed basic data of the target point in the second display area.
In a possible implementation of the present application, the display module is configured to, if there are multiple pieces of basic data for the target point, determine the hierarchical relationship among the multiple pieces of basic data and display the basic data of the target point level by level in the second display area according to that hierarchical relationship.
In a possible implementation of the present application, the third determining module is configured to establish a position mapping model according to the physical position information of the reference object and the position information of the reference object in the video image, and to determine the physical position information of the target point in the video image based on the position mapping model.
In one aspect, a computer device is provided, the computer device including a processor and a memory, the memory storing at least one instruction which, when executed by the processor, implements the video display method described in any one of the above.
In one aspect, a computer-readable storage medium is provided, the storage medium storing at least one instruction which, when executed, implements the video display method described in any one of the above.
The technical solutions provided by the embodiments of the present application bring at least the following beneficial effects:
after the physical position information of the reference object is determined from related data that includes the reference object and is collected by different data acquisition modes, the physical position information of the target point in the video image is determined from the physical position information of the reference object and the position information of the reference object in the video image, so that the basic data of the target point can be displayed in association through the physical position information. Different types of data and applications can thereby be linked up, fusion of multiple data acquisition modes is achieved, video display modes are enriched, and user experience is improved.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed to describe the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; other drawings can be derived from them by a person skilled in the art without inventive effort.
Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
Fig. 2 is a flowchart of a video display method provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of a video display interface provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a video display interface provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a video display apparatus provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a video display device provided by an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
With the continuous development of video applications, smart cities are being digitized, giving rise to many application ecosystems, such as GIS (Geographic Information System), high-precision GIS, 3D (three-dimensional) GIS, and the maps of individual government departments, all collecting urban data on a unified geographic basis. Urban data comes in many types, in large volumes, and with high complexity, so no single logic can unify it. Intelligent video applications are likewise diverse: virtual reality, augmented reality, mixed reality, and various wearable and smart devices.
In view of the many data acquisition modes and diverse video data types, the embodiments of the present application provide a video display method that links up different types of data and applications to achieve fusion of multiple data acquisition modes. The method can be applied to the video display system shown in fig. 1. As shown in fig. 1, the system includes: a data acquisition device 11, a data processing device 12, and a video display device 13.
The data acquisition device 11 is used to acquire video data. For example, the data acquisition device 11 may be a camera, with the acquired data being video images. The data acquisition device 11 may also be an unmanned aerial vehicle capable of 3D oblique photography, or a data collection vehicle equipped with sensors such as high-precision positioning, laser radar, and video, capable of collecting a high-precision map. The embodiments of the present application do not limit the kind of data acquisition device 11 or the manner in which it collects data. In addition, in the video display system provided by the embodiments of the present application, there may be two or more data acquisition devices 11.
High-precision map collection is performed by a vehicle driving along the road, capturing video of targets with distinctive features along the way and extracting their GPS positions; the collected target positions are therefore absolute GPS positions, and the coverage is wide. However, the collection range of a high-precision map is limited: in principle only data along roads can be collected, and three-dimensional targets such as buildings lack fine description.
Three-dimensional oblique photography mounts a group of cameras at angles to one another on an unmanned aerial vehicle, typically 5 or 7 cameras, with one camera pointing vertically downward and the other 4 or 6 at fixed angles to it. The unmanned aerial vehicle flies one circuit along the acquisition area to acquire 3D data of the area, and a 3D video is obtained through modeling. The resulting 3D scene is closer to the real environment, the data volume is large, and data can still be acquired in areas without roads. However, the cost of three-dimensional oblique photography is relatively high; over very large collection ranges the endurance and communication distance of the unmanned aerial vehicle fall short of expectations; inconsistent flying heights lead to inconsistent resolution, which strongly affects the precision of a target's GPS; and the positions of targets in the collected 3D video are relative positions, making fusion with a unified world coordinate system difficult.
Based on the above analysis, three-dimensional oblique photography and high-precision maps differ in acquisition means, technical form, and cost. To address this, the embodiments of the present application process the acquired data through the data processing device 12. For the different acquisition means, fusion of the data they collect is achieved based on the unique space-time attributes of each target, integrating multiple acquisition means across different scenes and linking up the data logic.
Once the data collected by the multiple acquisition means has been fused based on the targets' unique space-time attributes, positions in the video correspond to real positions (i.e., physical positions), and video display can be performed through the video display device 13, enabling fused application and display of video with BIM (Building Information Modeling), 3D GIS, high-precision GIS, AR (Augmented Reality), VR (Virtual Reality), wearable devices, and the like.
Next, in combination with the above system, the video display method provided by the embodiments of the present application is described through the following method embodiments.
The embodiments of the present application provide a video display method, which can be implemented by the system shown in fig. 1. As shown in fig. 2, the method includes the following steps.
In step 201, a video image is acquired, the video image being derived from images captured by one or more cameras.
The acquired video image serves as the basis for subsequent video display and can be obtained from images captured by one or more cameras. For example, an image captured by a single camera is acquired and used as the video image; or images captured by multiple cameras are acquired and then stitched together to obtain the video image, as in the sketch below.
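A minimal sketch of the multi-camera case using OpenCV's stitching API; the file names are hypothetical and the frames are assumed to overlap enough for feature matching:

```python
import cv2

# Frames captured at the same moment by several cameras (paths hypothetical).
frames = [cv2.imread(p) for p in ("cam1.jpg", "cam2.jpg", "cam3.jpg")]

# OpenCV's high-level stitcher performs feature matching, warping and blending.
stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, video_image = stitcher.stitch(frames)

if status == cv2.Stitcher_OK:
    cv2.imwrite("video_image.jpg", video_image)  # the stitched video image
else:
    raise RuntimeError(f"stitching failed with status {status}")
```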
In step 202, a plurality of reference objects in the video image are determined.
The acquired video image can be structurally processed using image recognition technology to automatically identify target points in the video image; a target point whose position is fixed can later be used to display associated information. For example, in a road video image, signposts, lane lines, signs, manhole covers, roadside cameras, buildings, and the like may serve as target points. After the target points are identified, reference objects are selected from among them; for example, target points for which coordinate conversion is easy are chosen as reference objects. Reference objects may also be selected manually, which the embodiments of the present application do not limit. Since the reference objects are later used to determine the physical position information of other target points in the video image, a plurality of reference objects are selected, for example at least three, although the embodiments of the present application do not limit the number. A sketch of this selection step follows.
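A minimal sketch of selecting reference objects from detected target points; `detect_targets` is a stand-in of ours for whatever recognition model is used, and all detections below are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class TargetPoint:
    label: str       # e.g. "street_lamp", "manhole_cover"
    pixel_xy: tuple  # position in the video image
    fixed: bool      # whether the target's physical position is fixed

def detect_targets(frame):
    # Stand-in for the structured image-recognition step; a real system would
    # run a trained detector here. The detections below are hypothetical.
    return [
        TargetPoint("street_lamp", (412, 515), fixed=True),
        TargetPoint("sign_post", (1230, 540), fixed=True),
        TargetPoint("manhole_cover", (655, 880), fixed=True),
        TargetPoint("vehicle", (900, 700), fixed=False),
    ]

targets = detect_targets(frame=None)
# Keep fixed targets that are easy to match across data sources as references;
# at least three are needed later to establish the position mapping.
references = [t for t in targets if t.fixed]
assert len(references) >= 3
```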
In step 203, related data that includes the reference object and is collected by different data acquisition modes is acquired.
Since the video image is captured by a camera, the position of each target point in the video image is a relative position within that image, and the coordinate system of the camera's video differs from the world coordinate system. Physical position information defined in the world coordinate system serves as a unique physical space-time reference through which the various types of data and display modes for the target points in the video image can be linked. The method provided by the embodiments of the present application therefore determines the physical position information of each target point in the video image and associates and displays the various types of basic data of the target points based on that physical position information.
To this end, the method selects reference objects from the video image, determines the physical position information of those reference objects, and then determines the physical position information of the target points in the video image from the physical position information of the reference objects. When determining the physical position information of a reference object, the method acquires related data that includes the reference object through different data acquisition modes, so that the physical position information of the reference object in the video image can be determined from related data containing physical position information. The embodiments of the present application do not limit the data acquisition modes, provided they can collect the physical position information of the reference object.
In addition, different types of reference objects suit different data acquisition modes, different acquisition modes collect different data types, and different video display scenarios impose different requirements on the collected data. For example, a reference object located along a road can be collected via a high-precision map; since the high-precision map records absolute GPS positions, i.e. physical positions, that acquisition mode suits this type of reference object. A reference object located in an area a vehicle cannot reach, such as a residential community or a key site, can be collected by 3D oblique photography.
Thus, in a possible embodiment of the present application, acquiring the related data that includes the reference object and is collected by different data acquisition modes includes: determining a target data acquisition mode according to the application scenario of the video display and the type of the reference object; and acquiring the related data that includes the reference object and is collected in the target data acquisition mode. The related data includes the physical position information of the reference object as well as attribute information other than the physical position information. The attribute information includes, but is not limited to, the length, width, and size of the reference object, its distances to other reference objects, and its angles in three-dimensional space, none of which the embodiments of the present application limit.
For example, in a road scenario, the data acquisition mode of the high-precision map may be used as the target data acquisition mode, collecting related data for target points such as manhole covers, street lamps, garbage cans, green belts, signposts, and roundabouts, including the GPS position data, i.e. physical position information, of each target point together with its attribute information. If a street lamp is the reference object, the related data including the street lamp can be acquired through the high-precision map acquisition mode.
For example, for buildings, residential communities, key sites, and other areas a vehicle cannot conveniently reach, or where three-dimensional structure cannot otherwise be captured, the 3D oblique photography acquisition mode is used as the target data acquisition mode for collecting the related data that includes the reference object.
In the 3D oblique photography acquisition mode, before the unmanned aerial vehicle takes off to collect data, the GPS position of the unmanned aerial vehicle and the GPS positions of several calibration points in the collection area are calibrated in advance, after which the 3D video is collected by 3D oblique photography. The GPS positions of points other than the pre-calibrated calibration points in the 3D video can then be obtained through conversion against the GPS of the calibration points and of the unmanned aerial vehicle, yielding a 3D video carrying position data. The 3D video thus includes the GPS position information, i.e. physical position information, of the reference object, and may additionally include attribute information other than physical position information, such as length, width, size, distances to other reference objects, and angles in three-dimensional space. A sketch of such a conversion follows.
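One way such a conversion could be realized is to align the 3D model's relative coordinates to world coordinates with a least-squares similarity transform fitted on the calibration points. This is a sketch assuming the calibration points' GPS fixes have already been converted to a local metric frame (e.g. ENU); the algorithm (Umeyama alignment), names, and values are ours, not the patent's:

```python
import numpy as np

def similarity_transform(src, dst):
    """Fit scale s, rotation R, translation t so that dst ≈ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points, N >= 3, not all collinear.
    Classic Umeyama least-squares alignment.
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # guard reflections
    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Calibration points: relative coordinates in the 3D model vs. the same points
# in a local metric frame derived from their GPS fixes (all values hypothetical).
model_pts = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0],
                      [0.0, 8.0, 0.0], [10.0, 8.0, 2.0]])
world_pts = np.array([[3.0, 4.0, 1.0], [13.1, 3.8, 1.1],
                      [2.9, 12.2, 0.9], [13.0, 12.1, 3.0]])

s, R, t = similarity_transform(model_pts, world_pts)
# Any other point of the 3D video can now be mapped into world coordinates:
p_world = s * R @ np.array([5.0, 4.0, 0.0]) + t
```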
In step 204, the physical position information of the reference object is determined based on the related data.
The same reference object can appear both in the video image and in the data collected by the various data acquisition modes, and the attribute information of the same reference object should be consistent. Since the related data contains the physical position information of the reference object, the method provided by this embodiment can match a reference object in the video image against a reference object in the target acquisition mode's data by attribute information. A successful match means they are the same reference object, so the physical position information of the reference object in the related data can be taken as the physical position information of the reference object in the video image.
In a possible embodiment of the present application, the physical position information of the reference object is determined according to the related data in ways including, but not limited to, the following two cases:
First case: when the target data acquisition mode is a first data acquisition mode, the related data includes a three-dimensional video obtained in the first data acquisition mode; the video image is fused with the three-dimensional video to obtain a fused video to be processed; the position information of the reference object in the video to be processed is determined; and the physical position information of the reference object is determined from the related data according to the position information of the reference object in the video to be processed.
Taking 3D oblique photography as the first data acquisition mode as an example, the related data that includes the reference object and is collected in this mode includes the 3D video obtained in this mode. The video image and the 3D video can be fused by algorithms such as texture mapping to obtain the fused video to be processed.
In this case, the fused video to be processed establishes the correspondence between the 3D video and the video image, and the fused portion can display the live camera video. Moreover, for fixed-position objects and areas that do not change over time, the fused content corresponds to the content in the 3D video. After the fused video to be processed is obtained, the position information of the reference object in the video to be processed is determined, and the physical position information of the reference object is determined from the related data according to that position information. For example, if the fused video to be processed includes a reference object B that also appears in the 3D video, the physical position information of reference object B can be determined from the related data.
Second case: the physical position information of the reference object is determined from the related data by matching, according to the attribute information of the reference object.
Taking the high-precision map acquisition mode as an example, the attribute information of a reference object collected via the high-precision map is matched against the attribute information of a reference object in the video image; if the match succeeds, the two are considered the same reference object. Because the reference object's data collected via the high-precision map includes its physical position information, that physical position information is associated with the video image, yielding the physical position information of the reference object in the video image.
For example, the attribute information of a reference object A collected via the high-precision map is matched against the attribute information of a reference object A' in the video image; if the match succeeds, A and A' are considered the same reference object. The related data of reference object A includes its physical position information, so that information is associated with the video image to obtain the physical position information of reference object A' in the video image. A sketch of such attribute matching follows.
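A minimal sketch of attribute-based matching; the attribute fields, tolerance, and records are hypothetical, not taken from the patent or any real map format:

```python
def attributes_match(video_ref, map_ref, tol=0.1):
    """Same type and dimensions within a relative tolerance (names illustrative)."""
    if video_ref["type"] != map_ref["type"]:
        return False
    return all(abs(video_ref[k] - map_ref[k]) <= tol * map_ref[k]
               for k in ("length_m", "width_m"))

# Reference object as recognized in the video image (hypothetical attributes).
video_ref = {"type": "street_lamp", "length_m": 6.1, "width_m": 0.32}

# Records from the high-precision map, each carrying an absolute GPS position.
map_records = [
    {"type": "street_lamp", "length_m": 6.0, "width_m": 0.30,
     "gps": (30.2741, 120.1551)},
    {"type": "sign_post", "length_m": 2.5, "width_m": 0.10,
     "gps": (30.2743, 120.1549)},
]

# A successful match means the same reference object, so the GPS position in
# the related data becomes the physical position of the reference in the image.
matches = [r for r in map_records if attributes_match(video_ref, r)]
physical_position = matches[0]["gps"] if matches else None
```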
It should be noted that, if the target acquisition mode is 3D oblique photography, determining the physical position information of the reference object from the related data by matching attribute information follows the same principle as the high-precision map case described above, and is not repeated here.
In step 205, the physical position information of the target point in the video image is determined according to the physical position information of the reference object and the position information of the reference object in the video image.
The position information of the reference object in the video image is defined in the video's coordinate system, while its physical position information is defined in the world coordinate system. Based on the physical position information of the reference objects and their position information in the video image, a mapping between the video coordinate system and the world coordinate system can be established, yielding the position mapping model; the physical position information of any target point in the video image can then be determined from this mapping. That is, in a possible embodiment of the present application, determining the physical position information of the target point in the video image according to the physical position information of the reference object and the position information of the reference object in the video image includes: establishing a position mapping model according to the physical position information of the reference object and the position information of the reference object in the video image; and determining the physical position information of the target point in the video image based on the position mapping model. A sketch of one such model follows.
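One way such a position mapping model could be realized, assuming the reference objects and target points lie roughly on a common ground plane, is a homography fitted from image pixels to plane coordinates. The coordinates below are hypothetical; the patent does not prescribe this particular model:

```python
import numpy as np
import cv2

# Pixel positions of the reference objects in the video image (hypothetical).
img_pts = np.array([[412, 515], [1230, 540], [655, 880], [1510, 905]],
                   np.float32)
# Their physical positions, e.g. metres in a local plane frame derived from
# the reference objects' GPS positions (hypothetical values).
world_pts = np.array([[0.0, 0.0], [25.0, 0.0], [3.0, 18.0], [28.0, 19.0]],
                     np.float32)

# The position mapping model: a homography, image plane -> ground plane.
H, _ = cv2.findHomography(img_pts, world_pts)

# Map an arbitrary target point from the video image to physical coordinates.
target_px = np.array([[[900.0, 700.0]]], np.float32)
target_world = cv2.perspectiveTransform(target_px, H)
print(target_world.ravel())  # physical position of the target point
```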
It should be appreciated that the video image may be captured by a camera whose position is fixed, yet the images captured at different times may still differ. Taking a road image as an example, an image captured at 8 a.m. may include 2 vehicles while an image captured at 9 a.m. includes 5. Although the images captured at different times differ, a reference object fixed on the road can be captured in every image. The method provided by the embodiments of the present application is therefore also applicable to video images whose content changes continuously; that is, basic data can be displayed even for target points whose positions in the video image are not fixed, because with this method the physical position information of each target point in the image can be determined based on the physical position information of the reference objects.
In step 206, the basic data of the target point is displayed in association based on the physical position information of the target point in the video image.
Because the basic data of a target point is carried in its respective service system, once the physical position information of the target point in the video image is obtained, the various kinds of basic data of the target point can be associated through that physical position information, so that the basic data can be displayed in association during video display. In a possible embodiment of the present application, the basic data of the target point includes, but is not limited to, the target point's service data, AR data, BIM data, map data, and the like, which the embodiments of the present application do not limit.
In a possible embodiment of the present application, displaying the basic data of the target point in association, based on the physical position information of the target point in the video image, includes: displaying the video image in a first display area and displaying target point information in the video image; and displaying the basic data of the target point in a second display area.
The target point information includes, but is not limited to, the content corresponding to the target point, a description of the target point, or an identifier of the target point, which the embodiments of the present application do not limit. Taking the display interface shown in fig. 3 as an example, the video image displayed in the first display area on the left includes 3 target points: target point A, target point B, and target point C. The second display area on the right displays the basic data of these 3 target points, which are of different types: the basic data of target point A is underground pipe network BIM information; the basic data of target point B is AR data based on an image captured at target point B; and the basic data of target point C is map information that includes target point C. Combined with an actual scene, the interface of fig. 3 may appear as shown in fig. 4.
If there are many target points, or a target point has much basic data, the second display area may not be able to show the basic data of all target points. In a possible implementation of the present application, when the target point information is displayed in the video image, a control is set for each target point; when a target point's control is detected as selected, the basic data of that target point is displayed in the second display area. Alternatively, after a target point is detected as selected, identifiers of its basic data are displayed, each identifier corresponding to one piece of basic data, and the basic data corresponding to the selected identifier is displayed in the second display area.
Beyond the above display manner, since the basic data of some target points have a space-time relationship, in a possible embodiment of the present application, displaying the basic data of the target point in the second display area includes: if there are multiple pieces of basic data for the target point, superimposing the multiple pieces of basic data in space and time, and displaying the superimposed basic data of the target point in the second display area.
For example, suppose the target point is a building whose physical position information allows its 3D and BIM information to be associated. In addition, video capture devices are installed on each floor of the building, so the basic data of the target point can also include the monitoring video of each floor. The building's 3D/BIM information and the floor monitoring videos have a space-time relationship, so these pieces of basic data can be superimposed in space and time: when the second display area shows the building's 3D/BIM information, the monitoring video of each floor is shown at the corresponding position beside the building. The basic data of the target point may further include service data; for example, the identity information of a resident on the 9th floor of the building can be superimposed while the building is displayed.
In a possible embodiment of the present application, displaying the basic data of the target point in the second display area includes: if there are multiple pieces of basic data for the target point, determining the hierarchical relationship among them, and displaying the basic data of the target point level by level in the second display area according to that hierarchical relationship.
Taking the building above as an example, the building and its floors have a hierarchical relationship: only the building's 3D/BIM information is displayed in the second display area, together with an indication point for each floor, and the monitoring video of a floor is then displayed progressively from its indication point. The basic data of the target point is thereby displayed level by level according to the hierarchical relationship among the multiple pieces of basic data, as in the sketch below.
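A minimal sketch of one data structure that could back such a level-by-level display; the kinds, payloads, and URLs are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class BasicData:
    kind: str      # e.g. "BIM", "floor_video", "service_record"
    payload: str   # model file, stream URL, record id, ...
    children: list = field(default_factory=list)  # lower-level basic data

# Hypothetical hierarchy for the building example: the BIM model is the top
# level, and each floor's monitoring video hangs one level below it.
building = BasicData("BIM", "building_model.ifc", children=[
    BasicData("floor_video", f"rtsp://example/floor{n}") for n in range(1, 10)
])

def render(node, depth=0):
    # Level-by-level display: in a real UI, a level would expand on demand
    # from its indication point; here we simply print the hierarchy.
    print("  " * depth + f"[{node.kind}] {node.payload}")
    for child in node.children:
        render(child, depth + 1)

render(building)
```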
With the method provided by the embodiments of the present application, after the physical position information of the reference object is determined from related data that includes the reference object and is collected by different data acquisition modes, the physical position information of the target point in the video image is determined from the physical position information of the reference object and the position information of the reference object in the video image, so that the basic data of the target point is displayed in association through the physical position information. Fusion of multiple data acquisition modes is thus achieved, different types of data and applications can be linked up, video display modes are enriched, and user experience is further improved.
Based on the same technical concept, referring to fig. 5, an embodiment of the present application provides a video display apparatus, which includes:
a first acquisition module 501, configured to acquire a video image, the video image being obtained from images captured by one or more cameras;
a first determining module 502, configured to determine reference objects in the video image, the reference objects being plural in number;
a second acquisition module 503, configured to acquire related data that includes the reference object and is collected by different data acquisition modes;
a second determining module 504, configured to determine physical location information of the reference object according to the related data;
a third determining module 505, configured to determine physical location information of a target point in the video image according to the physical location information of the reference object and the location information of the reference object in the video image;
and a display module 506, configured to display the basic data of the target point in association, based on the physical position information of the target point in the video image.
In a possible implementation of the present application, the second acquisition module 503 is configured to determine a target data acquisition mode according to the application scenario of the video display and the type of the reference object, and to acquire the related data that includes the reference object and is collected in the target data acquisition mode.
In a possible implementation of the present application, when the target data acquisition mode is a first data acquisition mode, the related data includes a three-dimensional video obtained in the first data acquisition mode, and the second acquisition module 503 is configured to: fuse the video image with the three-dimensional video to obtain a fused video to be processed; determine position information of the reference object in the video to be processed; and determine the physical position information of the reference object from the related data according to the position information of the reference object in the video to be processed.
In a possible implementation of the present application, the second acquisition module 503 is configured to determine, by matching, the physical position information of the reference object from the related data according to the attribute information of the reference object.
In a possible implementation of the present application, the display module 506 is configured to display the video image in a first display area, display target point information in the video image, and display the basic data of the target point in a second display area.
In a possible implementation of the present application, the display module 506 is configured to, if there are multiple pieces of basic data for the target point, superimpose the multiple pieces of basic data in space and time and display the superimposed basic data of the target point in the second display area.
In a possible implementation of the present application, the display module 506 is configured to, if there are multiple pieces of basic data for the target point, determine the hierarchical relationship among the multiple pieces of basic data and display the basic data of the target point level by level in the second display area according to that hierarchical relationship.
In a possible implementation of the present application, the third determining module 505 is configured to establish a position mapping model according to the physical position information of the reference object and the position information of the reference object in the video image, and to determine the physical position information of the target point in the video image based on the position mapping model.
It should be noted that, when the apparatus provided by the foregoing embodiments performs its functions, the division into the above functional modules is merely illustrative; in practical applications, the functions may be allocated to different functional modules as needed, i.e., the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and method embodiments provided above belong to the same concept; their specific implementation is detailed in the method embodiments and is not repeated here.
Fig. 6 is a schematic structural diagram of a video display device provided by an embodiment of the present application. The device may be a terminal, for example: a smartphone, a tablet computer, a notebook computer, or a desktop computer. A terminal may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
Generally, the terminal includes: a processor 601 and a memory 602.
Processor 601 may include one or more processing cores, such as a 4-core or 8-core processor. The processor 601 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor. The main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 601 may integrate a GPU (Graphics Processing Unit) responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 602 is used to store at least one instruction for execution by the processor 601 to implement the video display method provided by the method embodiments herein.
In some embodiments, the terminal may further optionally include: a peripheral interface 603, and at least one peripheral. The processor 601, memory 602, and peripheral interface 603 may be connected by a bus or signal line. The individual peripheral devices may be connected to the peripheral device interface 603 via buses, signal lines or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 604, a touch display 605, a camera 606, audio circuitry 607, a positioning component 608, and a power supply 609.
Peripheral interface 603 may be used to connect at least one Input/Output (I/O) related peripheral to processor 601 and memory 602. In some embodiments, the processor 601, memory 602, and peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 601, memory 602, and peripheral interface 603 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 604 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 604 communicates with communication networks and other communication devices via electromagnetic signals, converting electrical signals into electromagnetic signals for transmission, or converting received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 604 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 604 may communicate with other terminals via at least one wireless communication protocol, including but not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may also include NFC (Near Field Communication) related circuitry, which is not limited in this application.
The display screen 605 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display 605 is a touch display, it can also collect touch signals at or above its surface, which may be input to the processor 601 as control signals for processing. The display 605 may then also provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display 605, set on the front panel of the terminal; in other embodiments, there may be at least two displays 605, set on different surfaces of the terminal or in a folding design; in still other embodiments, the display 605 may be a flexible display set on a curved or folded surface of the terminal. The display 605 may even be arranged in a non-rectangular irregular pattern, i.e., an irregularly shaped screen. The display 605 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 606 is used to capture images or video. Optionally, the camera assembly 606 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera on its rear surface. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused for a background blurring function, or the main camera and the wide-angle camera for panoramic and VR (Virtual Reality) shooting or other fused shooting functions. In some embodiments, the camera assembly 606 may also include a flash, which can be a single-color-temperature or dual-color-temperature flash. A dual-color-temperature flash combines a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 607 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 601 for processing, or inputting the electric signals to the radio frequency circuit 604 for voice communication. For the purpose of stereo acquisition or noise reduction, a plurality of microphones can be respectively arranged at different parts of the terminal. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 607 may also include a headphone jack.
The positioning component 608 is used to locate the current geographic position of the terminal to enable navigation or LBS (Location Based Service). The positioning component 608 may be based on the United States' GPS (Global Positioning System), China's BeiDou system, Russia's GLONASS system, or the European Union's Galileo system.
The power supply 609 is used to power the various components in the terminal. The power source 609 may be alternating current, direct current, disposable battery or rechargeable battery. When the power source 609 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal further includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: acceleration sensor 611, gyroscope sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.
The acceleration sensor 611 can detect the magnitudes of accelerations on three coordinate axes of a coordinate system established with the terminal. For example, the acceleration sensor 611 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 601 may control the touch display screen 605 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 611. The acceleration sensor 611 may also be used for the acquisition of motion data of a game or a user.
The gyro sensor 612 can detect the body direction and rotation angle of the terminal, and can cooperate with the acceleration sensor 611 to capture the user's 3D motion on the terminal. Based on the data collected by the gyro sensor 612, the processor 601 can implement functions such as motion sensing (e.g., changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 613 may be disposed at a side frame of the terminal and/or at a lower layer of the touch screen 605. When the pressure sensor 613 is disposed at a side frame of the terminal, a grip signal of the terminal by a user may be detected, and the processor 601 performs left-right hand recognition or quick operation according to the grip signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed at the lower layer of the touch display screen 605, the processor 601 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 605. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 614 is used to collect the user's fingerprint, and either the processor 601 or the fingerprint sensor 614 itself identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, the processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 614 may be disposed on the front, back, or side of the terminal. When a physical button or vendor logo is provided on the terminal, the fingerprint sensor 614 may be integrated with the physical button or vendor logo.
The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, the processor 601 may control the display brightness of the touch display screen 605 according to the ambient light intensity collected by the optical sensor 615: when the ambient light intensity is high, the display brightness of the touch display screen 605 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 605 is turned down. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
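As a hedged sketch of the brightness adjustment described above (the lux range and logarithmic curve are assumptions for illustration, not values from this application; real devices use vendor-tuned tables), one plausible mapping from ambient illuminance to a normalized display brightness is:

import math

def brightness_from_lux(lux: float, lo: float = 10.0, hi: float = 10000.0) -> float:
    # Map ambient light (lux) to a brightness level in [0, 1] on a log
    # scale, since perceived brightness is roughly logarithmic in lux.
    lux = max(lo, min(hi, lux))
    return (math.log10(lux) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))

for lux in (5, 100, 1000, 20000):
    print(lux, round(brightness_from_lux(lux), 2))  # 0.0, 0.33, 0.67, 1.0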
The proximity sensor 616, also called a distance sensor, is typically disposed on the front panel of the terminal and is used to collect the distance between the user and the front face of the terminal. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front face of the terminal is gradually decreasing, the processor 601 controls the touch display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that this distance is gradually increasing, the processor 601 controls the touch display screen 605 to switch from the screen-off state to the screen-on state.
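A minimal sketch of this screen on/off switching, with an invented hysteresis band so the state does not flicker near a single threshold; the class name, centimeter units, and threshold values are all illustrative assumptions, not part of this application.

class ProximityScreenController:
    def __init__(self, near_cm: float = 3.0, far_cm: float = 6.0):
        self.near_cm = near_cm    # switch the screen off below this distance
        self.far_cm = far_cm      # switch the screen back on above this distance
        self.screen_on = True

    def update(self, distance_cm: float) -> bool:
        if self.screen_on and distance_cm < self.near_cm:
            self.screen_on = False   # user is close (e.g., during a call)
        elif not self.screen_on and distance_cm > self.far_cm:
            self.screen_on = True    # user moved away again
        return self.screen_on

ctrl = ProximityScreenController()
for d in (10.0, 4.0, 2.0, 4.0, 8.0):
    print(d, ctrl.update(d))  # True, True, False, False, True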
It will be appreciated by those skilled in the art that the structure shown in fig. 6 does not limit the terminal, which may include more or fewer components than shown, combine certain components, or adopt a different arrangement of components.
In an exemplary embodiment, a computer device is also provided, including a processor and a memory having at least one instruction stored therein. The at least one instruction is configured to be executed by one or more processors to implement any of the video presentation methods described above.
In an exemplary embodiment, a computer-readable storage medium is also provided, having stored therein at least one instruction that, when executed by a processor of a computer device, implements any of the video presentation methods described above.
Alternatively, the above computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
It should be understood that references herein to "a plurality" mean two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean that A exists alone, that A and B both exist, or that B exists alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship.
The foregoing embodiment numbers of the present application are for description only and do not represent the merits of the embodiments.
The foregoing is merely illustrative of exemplary embodiments of the present application and is not intended to limit the present application; any modifications, equivalent substitutions, or improvements made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (7)

1. A video presentation method, the method comprising:
acquiring a video image, wherein the video image is obtained from images captured by one or more cameras;
determining reference objects in the video image, wherein there are a plurality of the reference objects;
determining a target data acquisition mode according to an application scene displayed by a video and the type of the reference object, wherein different data acquisition modes are applicable to different reference object types, and the target data acquisition mode can acquire physical position information of the reference object;
acquiring related data comprising the reference object, wherein the related data comprise physical position information of the reference object and attribute information of the reference object except the physical position information, and the related data are acquired by the target data acquisition mode;
determining physical position information of the reference object according to the related data;
establishing a position mapping model according to the physical position information of the reference object and the position information of the reference object in the video image, and determining the physical position information of a target point in the video image based on the position mapping model;
displaying basic data of a target point in the video image in an associated manner based on the physical position information of the target point;
wherein the determining the physical position information of the reference object according to the related data comprises:
when the target data acquisition mode is a first data acquisition mode, the related data comprise three-dimensional video obtained according to the first data acquisition mode;
fusing the video image and the three-dimensional video to obtain a fused video to be processed, wherein the fused video to be processed establishes a correspondence between the three-dimensional video and the video image;
determining position information of the reference object in the video to be processed;
determining physical position information of the reference object from the related data according to the position information of the reference object in the video to be processed;
or,
when the target data acquisition mode is a data acquisition mode of a high-precision map, determining physical position information of the reference object according to the related data includes:
and determining, by matching according to the attribute information of the reference object, the physical position information of the reference object from the related data.
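For illustration only, and not as part of the claims: under the assumption that the reference objects lie roughly on a common ground plane, the "position mapping model" of claim 1 could be realized as a planar homography fitted from at least four reference objects whose pixel coordinates and physical coordinates are both known. The following Python sketch (function names and sample coordinates are invented for illustration) fits such a homography with a direct linear transform and maps an arbitrary target pixel to physical coordinates.

import numpy as np

def fit_homography(px: np.ndarray, world: np.ndarray) -> np.ndarray:
    # px, world: (N, 2) arrays of matched pixel/physical points, N >= 4.
    rows = []
    for (u, v), (x, y) in zip(px, world):
        rows.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        rows.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)  # smallest singular vector is H, up to scale

def map_point(H: np.ndarray, u: float, v: float) -> tuple:
    # Map a pixel (u, v) to physical coordinates via the homography.
    x, y, w = H @ np.array([u, v, 1.0])
    return (x / w, y / w)

# Four reference objects: pixel corners of a crosswalk and their map coordinates.
px = np.array([[100, 400], [520, 410], [560, 300], [130, 290]], dtype=float)
world = np.array([[0, 0], [8, 0], [8, 5], [0, 5]], dtype=float)
H = fit_homography(px, world)
print(map_point(H, 330, 350))  # physical position of an arbitrary target pixel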
2. The method of claim 1, wherein the displaying, in an associated manner, basic data of the target point based on the physical position information of the target point in the video image comprises:
displaying the video image in a first display area, and displaying information of the target point in the video image;
and displaying the basic data of the target point in a second display area.
3. The method of claim 2, wherein the displaying the basic data of the target point in the second display area comprises:
and if there are a plurality of pieces of basic data of the target point, performing space-time superposition on the plurality of pieces of basic data of the target point, and displaying the superposed basic data of the target point in the second display area.
4. The method of claim 2, wherein the displaying the basic data of the target point in the second display area comprises:
and if there are a plurality of pieces of basic data of the target point, determining the hierarchical relationship among the plurality of pieces of basic data of the target point, and displaying the basic data of the target point in the second display area level by level according to the hierarchical relationship among the plurality of pieces of basic data.
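For illustration only, and not as part of the claims: the space-time superposition of claim 3 and the hierarchical display of claim 4 could be sketched as below, where the record layout (a "time" field for superposition and a "level" field for hierarchy) is an assumption made for this example.

from collections import defaultdict

records = [
    {"source": "camera", "time": "2019-03-28T09:00", "level": 1, "text": "vehicle count: 12"},
    {"source": "gis",    "time": "2019-03-28T09:00", "level": 2, "text": "road: Main St"},
    {"source": "sensor", "time": "2019-03-28T09:05", "level": 1, "text": "speed avg: 42 km/h"},
]

# Space-time superposition: merge the records that share a timestamp.
by_time = defaultdict(list)
for r in records:
    by_time[r["time"]].append(r["text"])
for t in sorted(by_time):
    print(t, " | ".join(by_time[t]))

# Hierarchical display: order the records by level before rendering.
for r in sorted(records, key=lambda r: r["level"]):
    print(f"level {r['level']}: [{r['source']}] {r['text']}")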
5. A video presentation device, the device comprising:
the first acquisition module is used for acquiring video images, wherein the video images are obtained from images captured by one or more cameras;
the first determining module is used for determining a plurality of reference objects in the video image;
the second acquisition module is used for determining a target data acquisition mode according to the application scene displayed by the video and the type of the reference object, different data acquisition modes are applicable to different reference object types, and the target data acquisition mode can acquire the physical position information of the reference object; acquiring related data comprising the reference object, wherein the related data comprise physical position information of the reference object and attribute information of the reference object except the physical position information, and the related data are acquired by the target data acquisition mode;
the second determining module is used for, when the target data acquisition mode is a first data acquisition mode, including a three-dimensional video obtained according to the first data acquisition mode in the related data; fusing the video image and the three-dimensional video to obtain a fused video to be processed, wherein the fused video to be processed realizes the correspondence between the three-dimensional video and the video image; determining position information of the reference object in the video to be processed; determining physical position information of the reference object from the related data according to the position information of the reference object in the video to be processed; or, the second determining module is configured to, when the target data acquisition mode is a data acquisition mode of a high-precision map, determine, in a matching manner, physical location information of the reference object from the related data according to attribute information of the reference object;
the third determining module is used for establishing a position mapping model according to the physical position information of the reference object and the position information of the reference object in the video image, and determining the physical position information of a target point in the video image based on the position mapping model;
and the display module is used for displaying basic data of the target point in an associated manner based on the physical position information of the target point in the video image.
6. A computer device comprising a processor and a memory having stored therein at least one instruction which, when executed by the processor, implements the video presentation method of any of claims 1 to 4.
7. A computer readable storage medium having stored therein at least one instruction that, when executed, implements the video presentation method of any of claims 1 to 4.
CN201910245057.0A 2019-03-28 2019-03-28 Video display method, device, equipment and storage medium Active CN111754564B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910245057.0A CN111754564B (en) 2019-03-28 2019-03-28 Video display method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111754564A CN111754564A (en) 2020-10-09
CN111754564B (en) 2024-02-20

Family

ID=72672402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910245057.0A Active CN111754564B (en) 2019-03-28 2019-03-28 Video display method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111754564B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114088061B (en) * 2021-02-24 2024-03-22 上海商汤临港智能科技有限公司 Target positioning method and device, electronic equipment and storage medium
CN115375683A (en) * 2022-10-24 2022-11-22 江西省大气探测技术中心 Image processing-based waterlogging point detection method, system, storage medium and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106791613A (en) * 2016-11-30 2017-05-31 江苏省邮电规划设计院有限责任公司 A kind of intelligent monitor system being combined based on 3DGIS and video
CN106815678A (en) * 2016-12-16 2017-06-09 武汉微诚科技股份有限公司 Assets management-control method and system based on augmented reality and virtual reality technology
CN108447042A (en) * 2018-03-06 2018-08-24 北京建筑大学 The fusion method and system of urban landscape image data
CN108573522A (en) * 2017-03-14 2018-09-25 腾讯科技(深圳)有限公司 A kind of methods of exhibiting and terminal of flag data
CN108847026A (en) * 2018-05-31 2018-11-20 安徽四创电子股份有限公司 A method of it is converted based on matrix coordinate and realizes that data investigation is shown

Also Published As

Publication number Publication date
CN111754564A (en) 2020-10-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant