Disclosure of Invention
The object of the present invention is to solve at least one of the technical drawbacks mentioned above.
Therefore, the invention aims to provide a virtual-real fusion method and system for an oversized virtual scene and a dynamic shot screen.
In order to achieve the above object, an embodiment of the present invention provides a virtual-real fusion method for an oversized virtual scene and a dynamic shot screen, including the following steps:
step S1, acquiring the shot-video foreground that has been extracted on the vision processing computer;
step S2, modifying the texture coordinates of the video according to the height map data of the video foreground extracted in step S1 by a parallax occlusion mapping method, so that the original flat video acquires 3D detail;
step S3, generating an occlusion diffuse reflection map of the equal-proportion model in real time;
step S4, performing an Alpha blending calculation with the occlusion diffuse reflection map from step S3 to extract the pixel RGB values of the shot-screen area in the foreground, and calculating a soft shadow coefficient;
and step S5, applying a multi-GPU highlight-corrected bidirectional reflectance distribution algorithm, adjusting the illumination intensity of the diffuse and specular reflection in combination with the soft shadow coefficient, and recalculating the pixel RGB values of each object in the extracted foreground.
Further, the parallax occlusion mapping method includes:
acquiring a height map of the video foreground extracted in step S1, the height map containing surface height information;
cutting the surface depth of the height map into multiple equidistant layers, then sampling the height map from the topmost layer downward, shifting the texture coordinate along a preset direction at each step; when the point falls below the surface, that is, when the depth of the current layer is greater than the sampled depth, stopping the search, taking the texture coordinate of the last sample as the result, and performing an interpolation calculation on it to obtain the final mapping result.
Further, in step S4, calculating the soft shadow coefficient includes:
1) setting the soft shadow coefficient to 0 and the number of iteration steps to 4;
2) stepping forward along L to Ha; since Ha is less than H(TL1), the point is below the surface, and the soft shadow coefficient is calculated as Ha - H(TL1); this is the first of 4 checks in total, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 1.0/4.0), and the soft shadow coefficient is saved;
3) stepping forward along L to Hb; since Hb is less than H(TL2), the point is below the surface, and the soft shadow coefficient is calculated as Hb - H(TL2); this is the second of 4 checks, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 2.0/4.0), and the soft shadow coefficient is saved;
4) stepping forward along L; this point is above the surface;
5) taking the last step forward along L; this point is also above the surface;
6) the iterated point is now above the 0.0 level, and the iteration ends;
7) selecting the largest saved soft shadow coefficient as the final shadow coefficient value.
Further, in step S4, multi-GPU processing is adopted, including: processing with multiple GPUs distributed across multiple processes, the processes communicating via sockets.
Further, in step S5, applying the multi-GPU highlight-corrected bidirectional reflectance distribution algorithm includes: correcting extremely strong highlights by approximating an area light source.
An embodiment of the invention also provides a virtual-real fusion system for an oversized virtual scene and a dynamic shot screen, comprising: a virtual reality head-mounted device, a tracking locator, a vision processing computer, a video processing computer, a 3D depth camera, a simulator operation instrument and a display screen, wherein the virtual reality head-mounted device is connected with the vision processing computer, the vision processing computer is connected with the video processing computer, and the 3D depth camera is connected with the video processing computer; the simulator operation instrument and the display screen are connected with the vision processing computer, and the tracking locator is connected with the vision processing computer,
wherein the 3D depth camera is used for acquiring video images;
the tracking locator calibrates the position of the VR helmet in physical space relative to the VR positioning camera according to the video images captured by the 3D depth camera, and sets the camera position in virtual space to the position of the VR helmet;
the vision processing computer is used for extracting the shot-video foreground from the video images collected by the 3D depth camera;
the video processing computer is used for acquiring the shot-video foreground extracted on the vision processing computer; modifying the video texture coordinates according to the height map data of the extracted video foreground by a parallax occlusion mapping method, so that the original flat video acquires 3D detail; generating an occlusion diffuse reflection map of the equal-proportion model in real time; performing an Alpha blending calculation with the occlusion diffuse reflection map to extract the pixel RGB values of the shot-screen area in the foreground and calculating a soft shadow coefficient; and applying a multi-GPU highlight-corrected bidirectional reflectance distribution algorithm, adjusting the illumination intensity of the diffuse and specular reflection in combination with the soft shadow coefficient, and recalculating the pixel RGB values of each object in the extracted foreground;
the simulator operation instrument and the display screen are used for providing instrument-panel buttons with which a user inputs operation instructions; the operation signals are processed by the vision processing computer, and the processing results are presented to the user on the virtual reality head-mounted device.
Further, the parallax occlusion mapping method used by the video processing computer includes: acquiring a height map of the extracted video foreground, the height map containing surface height information; cutting the surface depth of the height map into multiple equidistant layers, then sampling the height map from the topmost layer downward, shifting the texture coordinate along a preset direction at each step; when the point falls below the surface, that is, when the depth of the current layer is greater than the sampled depth, stopping the search, taking the texture coordinate of the last sample as the result, and performing an interpolation calculation on it to obtain the final mapping result.
Further, the video processing computer calculates the soft shadow coefficient as follows:
1) setting the soft shadow coefficient to 0 and the number of iteration steps to 4;
2) stepping forward along L to Ha; since Ha is less than H(TL1), the point is below the surface, and the soft shadow coefficient is calculated as Ha - H(TL1); this is the first of 4 checks in total, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 1.0/4.0), and the soft shadow coefficient is saved;
3) stepping forward along L to Hb; since Hb is less than H(TL2), the point is below the surface, and the soft shadow coefficient is calculated as Hb - H(TL2); this is the second of 4 checks, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 2.0/4.0), and the soft shadow coefficient is saved;
4) stepping forward along L; this point is above the surface;
5) taking the last step forward along L; this point is also above the surface;
6) the iterated point is now above the 0.0 level, and the iteration ends;
7) selecting the largest saved soft shadow coefficient as the final shadow coefficient value.
Further, the video processing computer employs multi-GPU processing, including: processing with multiple GPUs distributed across multiple processes, the processes communicating via sockets.
Further, the video processing computer adopts the multi-GPU highlight-corrected bidirectional reflectance distribution algorithm, which includes: correcting extremely strong highlights by approximating an area light source.
The virtual-real fusion method and system for an oversized virtual scene and a dynamic shot screen according to the embodiments of the invention adopt the following techniques: detecting shadows over the whole shot video frame based on the model and shadow attributes; eliminating video shadows in real time based on Gaussian-model color consistency; and, in the three-dimensional engine, performing illumination calculation, BRDF highlight correction and performance optimization on the three-dimensional model corresponding to the extracted video foreground based on a physically based bidirectional reflectance distribution function (BRDF) model. With 3 or more GPUs, the basic mediated-reality requirement of an average of 60 frames/second is reached for an oversized scene (100 × 100 square kilometers), and the fused image frames achieve the same light-and-shadow effect for the shot video and the virtual view under directional light sources, point light sources, depth of field, motion blur and hard shadows.
The invention reaches the basic mediated-reality requirement of an average of 60 frames/second for an oversized scene (100 × 100 square kilometers) with 3 or more GPUs, and can fuse the video content of a dynamic shot screen. The fused image frames achieve the same light-and-shadow effect for the shot video and the virtual view under directional light sources, point light sources, depth of field, motion blur, hard shadows and other effects.
The virtual-real fusion method and system for an oversized virtual scene and a dynamic shot screen of the embodiments of the invention have the following characteristics:
(1) They are based on an extremely large virtual scene rather than a small indoor scene.
(2) They are based on physically based rendering, unlike previously used rendering models pieced together from empirical formulas (such as Phong).
(3) Unlike the augmented-reality approach of acquiring the light parameters of the physical space and then calculating shadows for the virtual model, the method simulates real illumination by physically based rendering with ray-traced shading in the virtual scene, without designing a separate algorithm for shadows.
(4) The method handles the case where the video content contains the dynamic picture of another display, i.e., a shot screen.
(5) Variable and non-variable regions can be distinguished and processed with different algorithms.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The invention provides a virtual-real fusion method and system for an oversized virtual scene and a dynamic shot screen. The whole three-dimensional scene is rendered with physically based materials, and an equal-proportion model and map are made for the foreground of the shot video. The dynamic-picture part of another display within the video content is regarded as the shot-screen area. The description is written on the assumption that the key problem of geometric consistency in virtual-real fusion has already been solved, so geometric consistency is not discussed.
As shown in fig. 1, the virtual-real fusion method for an oversized virtual scene and a dynamic shot screen according to the embodiment of the present invention includes the following steps:
In step S1, the shot-video foreground that has been extracted on the vision processing computer is acquired.
In step S2, a parallax occlusion mapping method is used to modify the texture coordinates of the video according to the height map data of the video foreground extracted in step S1, so that the original flat video acquires 3D detail.
The parallax occlusion mapping (POM) method includes the following:
A height map of the video foreground extracted in step S1 is obtained, the height map containing surface height information.
The surface depth of the height map is cut into multiple equidistant layers; the height map is then sampled starting from the topmost layer, shifting the texture coordinate along a preset direction at each step. If the point falls below the surface, that is, if the depth of the current layer is greater than the sampled depth, the search is stopped, the texture coordinate of the last sample is taken as the result, and an interpolation calculation is performed on it to obtain the final mapping result.
Specifically, parallax mapping is an enhanced form of normal mapping in computer graphics: it not only changes how illumination behaves but also creates the illusion of 3D detail on flat polygons. No additional primitives are generated; parallax mapping shifts not the original primitives but the texture coordinates used to fetch the color and normal.
To implement parallax mapping, a height map is required. Each pixel of the height map contains surface height information, interpreted as how far the corresponding point sinks below the surface. The values read from the height map are therefore inverted: black (0) represents a point level with the surface and white (1) represents the deepest recess.
Parallax occlusion mapping (POM) is a further improvement of steep parallax mapping.
Unlike simple parallax mapping, which merely shifts the texture coordinates without checking their validity, steep parallax mapping checks whether the result is close to the correct value. The core idea of the method is to cut the surface depth into equidistant layers. The height map is then sampled starting from the topmost layer, shifting the texture coordinates along the direction of V at each step. If the point is already below the surface (the depth of the current layer is greater than the sampled depth), the search is stopped and the texture coordinates of the last sample are used as the result.
The operation of steep parallax mapping is illustrated in the following figure. The depth is divided into 8 layers, each with a height value of 0.125. The texture coordinate offset per layer is V.xy / V.z * scale / numLayers. Referring to FIG. 3a, starting from the topmost point T0, the manual calculation steps are as follows:
1. The layer depth is 0 and the height map depth H(T0) is about 0.75. The sampled depth is greater than the layer depth, so the next iteration starts.
2. The texture coordinates are shifted along the V direction and the next layer is selected. The layer depth is 0.125 and the height map depth H(T1) is about 0.625. The sampled depth is greater than the layer depth, so the next iteration starts.
3. The texture coordinates are shifted along the V direction and the next layer is selected. The layer depth is 0.25 and the height map depth H(T2) is about 0.4. The sampled depth is greater than the layer depth, so the next iteration starts.
4. The texture coordinates are shifted along the V direction and the next layer is selected. The layer depth is 0.375 and the height map depth H(T3) is about 0.2. The sampled depth is now less than the layer depth, so the current point on vector V is below the surface. The texture coordinate found, Tp = T3, is an approximation of the actual intersection point.
The parallax occlusion mapping simply interpolates the results of the steep parallax mapping.
Fig. 3b shows the corresponding manual calculation steps:
1. nextHeight = H(T3) - currentLayerHeight
2. prevHeight = H(T2) - (currentLayerHeight - layerHeight)
3. weight = nextHeight / (nextHeight - prevHeight)
4. Tp = T2 * weight + T3 * (1.0 - weight)
Among parallax mapping methods there is also relief parallax mapping, which uses a binary search to improve accuracy, but the search degrades performance. Parallax occlusion mapping achieves better results than steep parallax mapping at better performance than relief parallax mapping. Although parallax occlusion mapping is more likely than relief parallax mapping to skip small details in the height map, it produces good results with relatively few samples; a shader-style sketch of the complete mapping follows.
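As an illustration, the following GLSL fragment-shader sketch implements the steep-parallax stepping and the final interpolation described above. It is a minimal sketch, assuming a heightMap whose red channel stores depth (0 = level with the surface, 1 = deepest), a tangent-space view direction, and hypothetical uniforms heightScale and numLayers; it is not taken verbatim from the patented implementation.

```glsl
#version 330 core

in vec2 vTexCoords;            // interpolated texture coordinates
in vec3 vViewDirTS;            // view direction in tangent space
out vec4 fragColor;

uniform sampler2D heightMap;   // red channel: 0 = surface level, 1 = deepest
uniform sampler2D diffuseMap;  // color texture sampled with the shifted coordinates
uniform float heightScale;     // e.g. 0.05 (assumed)
uniform float numLayers;       // e.g. 8.0, matching the 8-layer example above

vec2 parallaxOcclusionMapping(vec2 texCoords, vec3 viewDir)
{
    float layerDepth = 1.0 / numLayers;                 // equidistant layers
    float currentLayerDepth = 0.0;
    vec2 P = viewDir.xy / viewDir.z * heightScale;      // total offset along V
    vec2 deltaTexCoords = P / numLayers;                // offset per layer

    vec2  currentTexCoords = texCoords;
    float currentDepth = textureLod(heightMap, currentTexCoords, 0.0).r;

    // Steep parallax mapping: march layer by layer until the current layer
    // is deeper than the sampled depth (the point falls below the surface).
    while (currentLayerDepth < currentDepth)
    {
        currentTexCoords -= deltaTexCoords;
        currentDepth = textureLod(heightMap, currentTexCoords, 0.0).r;
        currentLayerDepth += layerDepth;
    }

    // Parallax occlusion mapping: interpolate between the last two samples,
    // mirroring steps 1-4 of FIG. 3b.
    vec2  prevTexCoords = currentTexCoords + deltaTexCoords;
    float nextHeight = currentDepth - currentLayerDepth;
    float prevHeight = textureLod(heightMap, prevTexCoords, 0.0).r
                       - (currentLayerDepth - layerDepth);
    float weight = nextHeight / (nextHeight - prevHeight);
    return prevTexCoords * weight + currentTexCoords * (1.0 - weight);
}

void main()
{
    vec2 uv = parallaxOcclusionMapping(vTexCoords, normalize(vViewDirTS));
    fragColor = texture(diffuseMap, uv);
}
```

In practice the number of layers is often varied with the viewing angle; a fixed numLayers is used here only to match the 8-layer example above.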
In step S3, an occlusion diffuse reflection map of the equal-proportion model is generated in real time.
Specifically, occlusion diffuse reflection is a modification of diffuse lighting that serves as a lighting model for the edges of an object; it is useful for simulating subsurface scattering.
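The description above ("a lighting model for the edges of an object, useful for simulating subsurface scattering") matches a wrap-style diffuse term; the following GLSL sketch shows one common formulation under that assumption. The wrap factor and the function name are assumptions for illustration, not taken from the patent text.

```glsl
// Hypothetical wrap-style diffuse term (to be included in a GLSL 330 fragment shader).
vec3 occlusionDiffuse(vec3 albedo, vec3 lightColor, vec3 N, vec3 L)
{
    const float wrap = 0.5;                                         // assumed wrap factor
    float ndotl = dot(normalize(N), normalize(L));
    float diff  = clamp((ndotl + wrap) / (1.0 + wrap), 0.0, 1.0);   // light wraps past the terminator
    return albedo * lightColor * diff;
}
```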
In step S4, an Alpha blending calculation is performed with the occlusion diffuse reflection map from step S3 to extract the pixel RGB values of the shot-screen area in the foreground, and a soft shadow coefficient is calculated.
Specifically, in step S4, the soft shadow coefficient is calculated as follows:
1) the soft shadow coefficient is set to 0 and the number of iteration steps is 4;
2) stepping forward along L to Ha; since Ha is less than H(TL1), the point is below the surface, and the soft shadow coefficient is calculated as Ha - H(TL1); this is the first of 4 checks in total, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 1.0/4.0), and the soft shadow coefficient is saved;
3) stepping forward along L to Hb; since Hb is less than H(TL2), the point is below the surface, and the soft shadow coefficient is calculated as Hb - H(TL2); this is the second of 4 checks, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 2.0/4.0), and the soft shadow coefficient is saved;
4) stepping forward along L; this point is above the surface;
5) taking the last step forward along L; this point is also above the surface;
6) the iterated point is now above the 0.0 level, and the iteration ends.
Specifically, the soft shadow takes multiple samples along the light source vector L, and only samples at points below the surface are counted. Each partial soft shadow coefficient is derived from the difference between the current layer depth and the height-map depth at the current point. The final shadow coefficient is the largest of the partial soft shadow coefficients, which gives the formula for the soft shadow coefficient:
SF = max(PSF)
Referring to fig. 4, the soft shadow coefficient is calculated in the following steps (a shader-style sketch follows these steps):
1. The soft shadow coefficient is set to 0 and the number of iteration steps is 4.
2. Step forward along L to Ha. Ha is less than H(TL1), so the point is below the surface. The soft shadow coefficient is calculated as Ha - H(TL1). This is the first of 4 checks, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 1.0/4.0). This soft shadow coefficient is saved.
3. Step forward along L to Hb. Hb is less than H(TL2), so this point is below the surface. The soft shadow coefficient is calculated as Hb - H(TL2). This is the second of 4 checks, so the distance effect is applied by multiplying the soft shadow coefficient by (1.0 - 2.0/4.0). This soft shadow coefficient is saved.
4. Step forward along L; this point is above the surface.
5. Take the last step forward along L; this point is also above the surface.
6. The iterated point is now above the 0.0 level, and the iteration ends.
7. The largest saved soft shadow coefficient is selected as the final shadow coefficient value.
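A GLSL sketch of these steps is given below. It follows the common parallax self-shadowing formulation: the ray is marched toward the light in tangent space, each partial shadow factor is attenuated by (1 - stepIndex/numSteps), and the largest one is kept, i.e. SF = max(PSF). The uniform names and the fixed four steps are assumptions for illustration, not the patented implementation.

```glsl
// To be included in a GLSL 330 fragment shader.
uniform sampler2D heightMap;   // red channel stores the height/depth values
uniform float heightScale;     // same scale as used for the parallax mapping

// lightDirTS: light direction in tangent space; initialTexCoord/initialHeight:
// texture coordinate and height of the point being shaded (from the POM step).
float parallaxSoftShadow(vec3 lightDirTS, vec2 initialTexCoord, float initialHeight)
{
    float shadowFactor = 0.0;                 // step 1: soft shadow coefficient = 0
    const float numSteps = 4.0;               // step 1: number of iteration steps = 4

    if (dot(vec3(0.0, 0.0, 1.0), lightDirTS) > 0.0)   // light in front of the surface
    {
        float layerHeight = initialHeight / numSteps;
        vec2  texStep = heightScale * lightDirTS.xy / lightDirTS.z / numSteps;

        float currentLayerHeight = initialHeight - layerHeight;
        vec2  currentTexCoords = initialTexCoord + texStep;
        float sampledHeight = textureLod(heightMap, currentTexCoords, 0.0).r;
        float stepIndex = 1.0;

        while (currentLayerHeight > 0.0)      // step 6: stop once above the 0.0 level
        {
            if (sampledHeight < currentLayerHeight)   // point is below the surface
            {
                // partial shadow factor, attenuated with distance from the shaded point
                float partial = (currentLayerHeight - sampledHeight)
                                * (1.0 - stepIndex / numSteps);
                shadowFactor = max(shadowFactor, partial);   // step 7: keep the largest
            }
            stepIndex += 1.0;
            currentLayerHeight -= layerHeight;
            currentTexCoords += texStep;
            sampledHeight = textureLod(heightMap, currentTexCoords, 0.0).r;
        }
    }
    return shadowFactor;   // used to attenuate diffuse/specular intensity in step S5
}
```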
The present invention distinguishes variable from non-variable regions: the model uses one or more specific materials with the same shader program, so each pixel value within the view frustum can be obtained in the fragment shader. The variable region (such as a display picture) is painted black in the texture of the model material, so that variable and non-variable regions can be distinguished reliably.
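As an illustration of how the black-marked variable region can be detected in the fragment shader and the shot-screen pixels alpha-blended (step S4), a sketch is given below. The texture names, the near-black threshold and the blending weight are assumptions, not values specified by the patent.

```glsl
#version 330 core

in vec2 vTexCoords;
out vec4 fragColor;

uniform sampler2D materialTex;          // model material texture; variable region painted black
uniform sampler2D occlusionDiffuseTex;  // occlusion diffuse reflection map from step S3
uniform sampler2D videoTex;             // extracted shot-video foreground
uniform float videoAlpha;               // assumed blending weight for the video pixel

void main()
{
    vec3 materialColor = texture(materialTex, vTexCoords).rgb;
    vec3 shaded = texture(occlusionDiffuseTex, vTexCoords).rgb;

    // The variable (shot-screen) region is marked by near-black texels in the material texture.
    bool isScreenArea = dot(materialColor, vec3(1.0)) < 0.01;

    if (isScreenArea)
    {
        // Alpha-blend the live video RGB over the shaded result for the shot-screen area.
        vec3 video = texture(videoTex, vTexCoords).rgb;
        fragColor = vec4(mix(shaded, video, videoAlpha), 1.0);
    }
    else
    {
        fragColor = vec4(shaded, 1.0);
    }
}
```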
In step S5, a multi-GPU highlight (specular) corrected bidirectional reflectance distribution function (BRDF) algorithm is applied; the illumination intensities of the diffuse and specular reflection (highlight) are adjusted in combination with the soft shadow coefficient, and the pixel RGB values of each object in the extracted foreground are recalculated.
Specifically, multi-GPU processing is adopted: the GPUs are distributed across multiple processes, and the processes communicate via sockets.
Most modern motherboards support multiple GPUs, which are connected to the system's CPU through the PCI-Express bus, so it is highly desirable to use every GPU available in the system to speed up rendering. The memoryless algorithm is well suited to parallel implementation, because each iteration can be performed completely independently of any other iteration.
The processes communicate via sockets, and different processing hosts are connected through gigabit network cards. The multi-GPU highlight-corrected bidirectional reflectance distribution algorithm includes: correcting extremely strong highlights by approximating an area light source.
Bidirectional reflectance distribution function (BRDF) algorithm: the basis of most physically based specular BRDFs is microfacet theory. This theory describes reflection from general surfaces that are not optically smooth. Its basic assumption is that the surface consists of many microfacets, each of which is optically smooth.
Each microfacet reflects light from a given incident direction into a single outgoing direction, which depends on the microfacet normal m. When the BRDF is evaluated, both the light direction l and the view direction v are given. This means that, of all the microfacets on the surface, only those that happen to reflect l into v contribute to the BRDF. Referring to fig. 5, the normal m of these effective microfacets lies exactly halfway between l and v, i.e. m = h; however, not all microfacets with m = h contribute to the reflection. Some microfacets are blocked by other microfacets in the light direction l (shadowing), and some are blocked in the view direction v (masking). Microfacet theory assumes that all blocked light contributes nothing to the BRDF. The microfacet specular BRDF can therefore be expressed as:
f(l, v) = F(l, h) G(l, v, h) D(h) / (4 (n·l)(n·v))
where F(l, h) is the Fresnel reflectance of the effective microfacets (m = h), G(l, v, h) is the proportion of effective microfacets that are neither shadowed nor masked, D(h) is the normal distribution function of the microfacets, i.e. the density of microfacets whose normal equals h, and the denominator 4(n·l)(n·v) is a correction factor accounting for the transformation of quantities from the local space of the microfacets to that of the overall surface.
The invention uses the bidirectional reflectance distribution function (BRDF) algorithm to calculate the radiance values of the three RGB channels of each pixel under the current illumination.
Reflection correction and performance optimization are applied to the traditional bidirectional reflectance distribution function: extremely strong highlights are corrected by approximating an area light source. Perceptible highlights can also appear on some rough objects at oblique angles; cspec lies between 0.03 and 0.06, and the value of alpha is very small (between 0.1 and 2.0).
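For illustration, the following GLSL sketch evaluates the microfacet specular BRDF f(l, v) = F·G·D / (4(n·l)(n·v)) together with a Lambertian diffuse term, and attenuates both by the soft shadow coefficient from step S4, as step S5 describes. The specific Fresnel, distribution and geometry terms (Schlick, GGX, Smith-Schlick) and the parameter names are common choices assumed here rather than ones specified by the patent; cspec corresponds to the specular color mentioned above.

```glsl
// To be included in a GLSL 330 fragment shader.
const float PI = 3.14159265359;

vec3 fresnelSchlick(float cosTheta, vec3 cspec)          // F(l, h)
{
    return cspec + (1.0 - cspec) * pow(1.0 - cosTheta, 5.0);
}

float distributionGGX(vec3 n, vec3 h, float alpha)       // D(h), GGX/Trowbridge-Reitz
{
    float a2 = alpha * alpha;
    float ndoth = max(dot(n, h), 0.0);
    float d = ndoth * ndoth * (a2 - 1.0) + 1.0;
    return a2 / (PI * d * d);
}

float geometrySmith(vec3 n, vec3 v, vec3 l, float alpha) // G(l, v, h), Smith-Schlick
{
    float k = alpha * 0.5;
    float ndotv = max(dot(n, v), 0.0);
    float ndotl = max(dot(n, l), 0.0);
    return (ndotv / (ndotv * (1.0 - k) + k)) *
           (ndotl / (ndotl * (1.0 - k) + k));
}

// f(l, v) = F(l, h) * G(l, v, h) * D(h) / (4 (n·l)(n·v))
vec3 specularBRDF(vec3 n, vec3 v, vec3 l, vec3 cspec, float alpha)
{
    vec3  h = normalize(v + l);
    vec3  F = fresnelSchlick(max(dot(h, v), 0.0), cspec);
    float D = distributionGGX(n, h, alpha);
    float G = geometrySmith(n, v, l, alpha);
    float denom = 4.0 * max(dot(n, v), 0.0) * max(dot(n, l), 0.0) + 1e-5;
    return F * G * D / denom;
}

// Step S5 (sketch): diffuse + specular, attenuated by the soft shadow coefficient.
vec3 shadePixel(vec3 albedo, vec3 n, vec3 v, vec3 l, vec3 lightColor,
                vec3 cspec, float alpha, float softShadowCoeff)
{
    float ndotl = max(dot(n, l), 0.0);
    float shadow = clamp(1.0 - softShadowCoeff, 0.0, 1.0);   // larger coefficient -> darker
    vec3 diffuse  = albedo / PI;
    vec3 specular = specularBRDF(n, v, l, cspec, alpha);
    return (diffuse + specular) * lightColor * ndotl * shadow;
}
```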
As shown in fig. 2, an embodiment of the present invention further provides a virtual-real fusion system for an oversized virtual scene and a dynamic shot screen, including: a virtual reality head-mounted device 100 (VR headset), a tracking locator 500, a vision processing computer 300, a video processing computer 200, a 3D depth camera 600, a simulator operation instrument and a display screen 400.
Specifically, the virtual reality head-mounted device 100 is connected with the vision processing computer 300, the vision processing computer 300 is connected with the video processing computer 200, and the 3D depth camera 600 is connected with the video processing computer 200; the simulator operation instrument and the display screen 400 are connected to the vision processing computer 300, and the tracking locator 500 is connected to the vision processing computer 300.
In one embodiment of the present invention, the virtual reality head mounted device 100 is connected to the vision processing computer 300 through USB3.0 and HDMI interface, the 3D depth camera 600 is connected to the video processing computer 200 through USB3.0, the vision processing computer 300 and the video processing computer 200 are connected through LAN, and the 3D depth camera 600 is connected to the virtual reality head mounted device 100; the simulator operating instrument and the display screen 400 are connected to the vision processing computer 300 through a USB3.0, and the tracking locator 500 is connected to the vision processing computer 300 through a USB3.0 interface.
In one embodiment of the present invention, the virtual reality head-mounted device 100 may be an Oculus Rift virtual reality device. The 3D depth camera 600 may be a ZED stereo camera or an Intel RealSense SR300 (the camera is mounted on the Oculus Rift head-mounted device 100, i.e. the helmet).
In addition, the foreground extraction system based on combined virtual-real multi-space positioning adopts the following three-dimensional engine software: Unity.
The video images collected by the 3D depth camera 600 include: color video, depth video and infrared video.
The tracking locator 500 calibrates the position of the VR helmet in physical space relative to the VR positioning camera according to the video images captured by the 3D depth camera 600, and sets the camera position in virtual space to the position of the VR helmet.
In addition, the tracking locator 500 is used to monitor the user's head pose data and send it to the vision processing computer 300.
The vision processing computer is used for extracting the shot-video foreground from the video images collected by the 3D depth camera 600.
The video processing computer modifies the video texture coordinates according to the height map data of the extracted video foreground using the parallax occlusion mapping method, so that the original flat video acquires 3D detail; generates an occlusion diffuse reflection map of the equal-proportion model in real time; performs an Alpha blending calculation with the occlusion diffuse reflection map to extract the pixel RGB values of the shot-screen area in the foreground and calculates the soft shadow coefficient; and applies the multi-GPU highlight-corrected bidirectional reflectance distribution algorithm, adjusting the illumination intensity of the diffuse and specular reflection in combination with the soft shadow coefficient to recalculate the pixel RGB values of each object in the extracted foreground.
The video processing computer adopts multi-GPU processing: the GPUs are distributed across multiple processes, and the processes communicate via sockets.
Most modern motherboards support multiple GPUs, which are connected to the system's CPU through the PCI-Express bus, so it is highly desirable to use every GPU available in the system to speed up rendering. The memoryless algorithm is well suited to parallel implementation, because each iteration can be performed completely independently of any other iteration.
The processes communicate via sockets, and different processing hosts are connected through gigabit network cards. The multi-GPU highlight-corrected bidirectional reflectance distribution algorithm includes: correcting extremely strong highlights by approximating an area light source.
Reflection correction and performance optimization are applied to the traditional bidirectional reflectance distribution function: extremely strong highlights are corrected by approximating an area light source. Perceptible highlights can also appear on some rough objects at oblique angles; cspec lies between 0.03 and 0.06, and the value of alpha is very small (between 0.1 and 2.0).
The video processing computer adopts the multi-GPU highlight-corrected bidirectional reflectance distribution algorithm, which includes: correcting extremely strong highlights by approximating an area light source.
The simulator operation instrument and the display screen 400 provide instrument-panel buttons with which the user inputs operation commands; the operation signals are processed by the vision processing computer 300, and the processing results are presented to the user by the virtual reality head-mounted device 100.
The virtual-real fusion method and system for an oversized virtual scene and a dynamic shot screen according to the embodiments of the invention are analyzed as follows:
In the graphics domain, programmable high-performance rasterization pipelines have been the core of interactive rendering over the past decade.
Programmers of modern Graphics Processing Units (GPUs) write "shaders" in C-based languages such as HLSL or GLSL to supply code for the inner loops of the pipeline. The shader programming model is data-parallel; it exposes abundant parallelism and maps well onto the underlying SIMD hardware architecture. Although many fields besides graphics now use programmable GPUs for high-performance computing, it remained unclear whether a programmable, high-performance ray tracing software system could be built on GPUs.
Optix, a "general purpose ray tracing engine based on NVIDIA, Inc. 2010, is not a fully functional renderer, but rather a framework of both programmability and high performance, following a similar profile as most ray tracing algorithms, enabling a programmable pipeline to operate a user. Focus on the bottom sub-operations of ray tracing and avoid having to render specific structures. This concept allows the general purpose, following the same profile, to solve a number of problems. It can also be used for interactive and off-line friendship algorithms, and other problem solving, such as collision detection, artificial intelligence, and scientific simulation. Because it is architecturally built, it can be used for all architecturally supported systems.
The invention was given basic tests using a test framework implemented with the OptiX SDK, with scenes selected from different scenarios.
The scene information and the parameters used for rendering the images are shown in Table 1:
TABLE 1
Each iteration emits 512² and 1024² photons, respectively, and the photon map is implemented on a 100³-voxel grid. In the direct illumination estimation, four shadow samples are used per pixel. These measurements do not include any start-up time, nor the time taken to construct the scene, transform the geometry, or build the acceleration structures. The emission process is restarted several times to prevent the bias caused by a "cold start". After 200 iterations of the algorithm, the average number of iterations per second over the whole time interval is calculated. Tables 2 and 3 below show the absolute and relative performance of the graphics cards, respectively.
TABLE 2
TABLE 3
With the method provided by the invention, multiple GPUs can be used in parallel to accelerate the computation.
1024² photons are emitted per iteration, and the algorithm is run 200 and 1000 times. Table 4 shows the speed-up ratio of the multi-GPU system.
TABLE 4
Tests show that three GPUs can basically meet the real-time frame-rate requirement of mediated reality.
The virtual-real fusion method and system for an oversized virtual scene and a dynamic shot screen according to the embodiments of the invention adopt the following techniques: detecting shadows over the whole shot video frame based on the model and shadow attributes; eliminating video shadows in real time based on Gaussian-model color consistency; and, in the three-dimensional engine, performing illumination calculation, BRDF highlight correction and performance optimization on the three-dimensional model corresponding to the extracted video foreground based on a physically based bidirectional reflectance distribution function (BRDF) model. With 3 or more GPUs, the basic mediated-reality requirement of an average of 60 frames/second is reached for an oversized scene (100 × 100 square kilometers), and the fused image frames achieve the same light-and-shadow effect for the shot video and the virtual view under directional light sources, point light sources, depth of field, motion blur and hard shadows.
The invention reaches the basic mediated-reality requirement of an average of 60 frames/second for an oversized scene (100 × 100 square kilometers) with 3 or more GPUs, and can fuse the video content of a dynamic shot screen. The fused image frames achieve the same light-and-shadow effect for the shot video and the virtual view under directional light sources, point light sources, depth of field, motion blur, hard shadows and other effects.
The virtual-real fusion method and system for an oversized virtual scene and a dynamic shot screen of the embodiments of the invention have the following characteristics:
(1) They are based on an extremely large virtual scene rather than a small indoor scene.
(2) They are based on physically based rendering, unlike previously used rendering models pieced together from empirical formulas (such as Phong).
(3) Unlike the augmented-reality approach of acquiring the light parameters of the physical space and then calculating shadows for the virtual model, the method simulates real illumination by physically based rendering with ray-traced shading in the virtual scene, without designing a separate algorithm for shadows.
(4) The method handles the case where the video content contains the dynamic picture of another display, i.e., a shot screen.
(5) Variable and non-variable regions can be distinguished and processed with different algorithms.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.