
CN120355569A - Virtual-real view fusion method and device, electronic equipment and storage medium - Google Patents

Virtual-real view fusion method and device, electronic equipment and storage medium

Info

Publication number
CN120355569A
CN120355569A (Application CN202510804180.7A)
Authority
CN
China
Prior art keywords
virtual
real
view
entity
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202510804180.7A
Other languages
Chinese (zh)
Other versions
CN120355569B (en)
Inventor
隋文涛
赵石彬
韩文斌
范士海
骆同伟
王鹏
高阁
刘思超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Ruichenxinchuang Network Technology Co ltd
Original Assignee
Nanjing Ruichenxinchuang Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Ruichenxinchuang Network Technology Co ltd filed Critical Nanjing Ruichenxinchuang Network Technology Co ltd
Priority to CN202510804180.7A
Publication of CN120355569A
Application granted
Publication of CN120355569B
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/4038 — Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T7/20 — Analysis of motion
    • G06T7/55 — Depth or shape recovery from multiple images
    • G06T7/73 — Determining position or orientation of objects or cameras using feature-based methods
    • G06T2200/32 — Indexing scheme involving image mosaicing
    • G06T2207/20081 — Training; Learning
    • G06T2207/20221 — Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application discloses a virtual-real view fusion method and device, electronic equipment, and a storage medium. The method comprises: constructing a virtual scene based on a real scene, the virtual scene comprising virtual entities and twins in one-to-one correspondence with the real entities; acquiring pose information of each real entity in real time and synchronizing the pose information of its twin; acquiring pose information of a real viewpoint and obtaining an initial virtual view based on that pose information and the virtual scene; removing the invisible parts of the virtual entities, together with the twins, from the initial virtual view to obtain a target virtual view; and obtaining a fused view based on the target virtual view. The method removes the occluded parts of a virtual entity based on the twin of the real entity before fusing with the real view, so that the motion of the virtual entity is constrained by the real scene. This effectively ensures the dynamic consistency of the virtual and real views, the positional accuracy of the virtual entity in the fused view, and the accuracy of the fused depth information.

Description

Virtual-real view fusion method and device, electronic equipment and storage medium
Technical Field
The application belongs to the technical field of virtual-real view fusion, and particularly relates to a virtual-real view fusion method and device, electronic equipment, and a storage medium.
Background
Virtual-real view fusion refers to fusing a real view and a virtual view into a single fused view and visualizing it. Existing virtual-real view fusion techniques superimpose the virtual entity directly onto the real view, so the motion of the virtual entity is not constrained by the real view. For moving scenes, the dynamic consistency of the virtual and real views cannot be guaranteed during fusion, which degrades the positional accuracy of the virtual entity in the final fused view and the accuracy of the fused depth information.
Disclosure of Invention
The application aims to provide a virtual-real view fusion method and device, electronic equipment, and a storage medium, to solve the problem in the prior art that, when a virtual view is superimposed directly onto a real view, the dynamic consistency of the virtual and real views cannot be guaranteed for a moving scene.
In order to achieve the above object, a first aspect of the present application provides a method for fusing virtual and real views, including:
constructing a virtual scene based on a real scene, wherein the real scene comprises real entities, and the virtual scene comprises virtual entities and twins in one-to-one correspondence with the real entities;
acquiring pose information of the real entities in real time, and synchronizing the pose information of the twins;
acquiring pose information of a real viewpoint, and obtaining an initial virtual view based on the pose information of the real viewpoint and the virtual scene;
removing the invisible parts of the virtual entities, together with the twins, from the initial virtual view to obtain a target virtual view; and
obtaining a fused view based on the target virtual view.
In one or more embodiments, the step of obtaining an initial virtual view based on pose information of the real viewpoint and the virtual scene includes:
synchronizing a virtual viewpoint having the same pose information in the virtual scene based on the pose information of the real viewpoint; and
obtaining an initial virtual view based on the pose information of the virtual viewpoint and the virtual scene.
In one or more embodiments, the step of culling the invisible parts of the virtual entity and the twin from the initial virtual view to obtain a target virtual view includes:
traversing each pixel of the initial virtual view, and judging whether both the virtual entity and the twin are present at the current pixel;
if so, judging whether the depth value of the virtual entity at the current pixel is greater than the depth value of the twin at the current pixel;
if so, culling the current pixel of the virtual entity; and
after the traversal, culling the twin.
In one or more embodiments, the real viewpoint is a depth camera, culling the current pixel of the virtual entity specifically means setting the texture information and depth information of that pixel to be unwritten, and culling the twin specifically means setting the texture information and depth information of the twin to be unwritten; or,
the real viewpoint is a human eye with an optical perspective device arranged on its observation path, culling the current pixel of the virtual entity specifically means setting the texture information of that pixel to be unwritten, and culling the twin specifically means setting the texture information of the twin to be unwritten.
In one or more embodiments, the real viewpoint is a depth camera;
based on the target virtual view, the step of obtaining a fused view includes:
obtaining a real view based on the pose information of the real viewpoint and the real scene; and
fusing the real view and the target virtual view to obtain a fused view.
In one or more embodiments, the step of fusing the real view and the target virtual view to obtain a fused view includes:
fusing the depth matrix of the real view and the depth matrix of the target virtual view to obtain a fused depth matrix, where a depth matrix describes the depth value at each pixel:
D_f(i,j) = min(D_r(i,j), D_v(i,j)),
where D_f represents the fused depth matrix, D_r the depth matrix of the real view, D_v the depth matrix of the target virtual view, and i and j the coordinates of a pixel;
fusing the pixel matrix of the real view and the pixel matrix of the target virtual view to obtain a fused pixel matrix, where a pixel matrix describes the color value at each pixel:
P_f(i,j) = P_v(i,j) if D_v(i,j) ≤ D_r(i,j), otherwise P_r(i,j),
where P_f represents the fused pixel matrix, P_r the pixel matrix of the real view, P_v the pixel matrix of the target virtual view, and i and j the coordinates of a pixel; and
obtaining a fused view based on the fused depth matrix and the fused pixel matrix.
In one or more embodiments, the real viewpoint is a human eye, and an optical perspective device is arranged on the observation path of the real viewpoint;
based on the target virtual view, the step of obtaining a fused view includes:
projecting the target virtual view onto the optical perspective device, so that the target virtual view and the real view are fused and imaged in the human eye to obtain a fused view.
In order to achieve the above object, a second aspect of the present application provides a virtual-real view fusion device, including:
a virtual scene construction module for constructing a virtual scene based on a real scene, wherein the real scene comprises real entities, and the virtual scene comprises virtual entities and twins in one-to-one correspondence with the real entities;
a synchronization module for acquiring pose information of the real entities in real time and synchronizing the pose information of the twins;
a virtual view acquisition module for acquiring pose information of a real viewpoint and obtaining an initial virtual view based on the pose information of the real viewpoint and the virtual scene;
a virtual view processing module for culling the invisible parts of the virtual entities and the twins from the initial virtual view to obtain a target virtual view; and
a fusion module for obtaining a fused view based on the target virtual view.
To achieve the above object, a third aspect of the present application provides an electronic device, including:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the virtual-real view fusion method of any of the above embodiments.
To achieve the above object, a fourth aspect of the present application provides a machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the virtual-real view fusion method of any of the above embodiments.
Compared with the prior art, the application has the following beneficial effects:
the method constructs and synchronizes a virtual scene based on the real scene, obtains a virtual view from the virtual scene, removes the occluded parts of the virtual entity based on the twin of the real entity, and then fuses the result with the real view. The motion of the virtual entity is thereby constrained by the real scene, which effectively ensures the dynamic consistency of the virtual and real views, the positional accuracy of the virtual entity in the fused view, and the accuracy of the fused depth information.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings may be obtained according to the drawings without inventive effort to those skilled in the art.
FIG. 1 is a flow chart of an embodiment of a method for merging virtual and real views according to the present application;
FIG. 2 is a schematic diagram of one embodiment of a virtual scene of the present application;
FIG. 3 is a flowchart of an embodiment corresponding to S300 in FIG. 1;
FIG. 4 is a schematic diagram of an embodiment of an initial virtual view of the present application;
FIG. 5 is a flowchart of an embodiment corresponding to S400 in FIG. 1;
FIG. 6 is a schematic diagram of one embodiment of a target virtual view of the present application;
FIG. 7 is a schematic view of the effect of culling in the present application;
FIG. 8 is a schematic diagram of one embodiment of a fused view of the present application;
FIG. 9 is a schematic diagram of a real view, a virtual view, and a fused view in an application scenario of the present application;
FIG. 10 is a schematic structural diagram of an embodiment of a fusion device for virtual-real views according to the present application;
fig. 11 is a schematic structural view of an embodiment of the electronic device of the present application.
Detailed Description
In order to make the technical solution of the present application better understood by those skilled in the art, the technical solution of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
At present, virtual-real view fusion methods superimpose the virtual entity directly onto the real view. Since the virtual entity is not constrained by the real view, motion consistency between the virtual entity and the real entities cannot be guaranteed; in particular, during motion, position misalignment and depth misalignment between real and virtual entities arise, so the fusion effect is poor.
To solve these problems, the applicant developed a new virtual-real view fusion method that handles the positional and occlusion relations between virtual entities and real entities within the virtual scene, and performs virtual-real fusion based on the processed virtual view. This effectively improves the motion consistency of virtual and real entities, guarantees fusion precision, and improves the fusion effect.
Specifically, referring to fig. 1, fig. 1 is a flow chart illustrating an embodiment of a method for merging virtual and real views according to the present application.
As shown in fig. 1, the method includes:
S100, constructing a virtual scene based on the real scene.
The real scene comprises real entities, and the virtual scene comprises virtual entities and twins corresponding to the real entities one by one.
The real scene and the virtual scene are based on a unified coordinate system, and all elements of the real scene can be accurately modeled through a digital twin technology, so that the consistency of the real scene and the virtual scene is ensured.
Specifically, any modeling engine commonly used in the art, such as Unreal Engine, may be used to construct the virtual scene from the collected real-scene information; details are omitted here. A user can design virtual entities according to the fusion requirements and add them to the virtual scene, thereby obtaining a virtual scene comprising both twins and virtual entities.
For example, referring to fig. 2, fig. 2 is a schematic diagram of an embodiment of a virtual scene according to the present application, where the virtual scene in fig. 2 includes twin bodies corresponding to real entities one by one, and also includes virtual entities constructed by users.
S200, acquiring pose information of a real entity in real time, and synchronizing the pose information of the twin body.
After the virtual scene is constructed, each twin in the virtual scene needs to be synchronized in real time with its real entity. The synchronized pose information may include position, orientation, posture, motion, and the like.
This synchronization ensures the dynamic consistency of the virtual and real scenes, which in turn ensures the dynamic consistency of the virtual and real entities in the subsequent fusion processing.
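The real-time synchronization of S200 can be sketched as a simple pose overwrite. All names below (`Pose`, `Twin`, `sync_twin`) are illustrative stand-ins, not from the patent; a real system would read the pose from positioning, attitude, and motion-capture sensors and push it into the engine.

```python
from dataclasses import dataclass, field


@dataclass
class Pose:
    """Minimal pose record: position plus orientation (e.g. Euler angles)."""
    position: tuple = (0.0, 0.0, 0.0)
    orientation: tuple = (0.0, 0.0, 0.0)


@dataclass
class Twin:
    """Digital twin of a real entity in the virtual scene."""
    pose: Pose = field(default_factory=Pose)


def sync_twin(twin: Twin, sensed_pose: Pose) -> Twin:
    # Overwrite the twin's pose with the latest sensed pose of the real entity.
    twin.pose = sensed_pose
    return twin


# One synchronization step: the real entity moved to (1, 2, 3), facing 90 degrees.
t = sync_twin(Twin(), Pose(position=(1.0, 2.0, 3.0), orientation=(0.0, 90.0, 0.0)))
print(t.pose.position)
```

In practice this overwrite runs in a loop at the sensor rate, keeping the twin aligned with the real entity at every frame.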
S300, acquiring pose information of a real viewpoint, and acquiring an initial virtual view based on the pose information of the real viewpoint and the virtual scene.
A real viewpoint is a point in the real scene from which view images are captured. By acquiring its pose information, such as position, attitude, viewing angle, and focal length, the image information visible within its field of view, i.e., the view, can be obtained.
In one embodiment, the real viewpoint may be a depth camera, and pose information of the depth camera may be obtained based on sensors disposed on the depth camera and preset parameters of the depth camera.
In another embodiment, the real viewpoint may also be a human eye, and pose information of the human eye may be obtained based on a sensor disposed on the head of the human body.
Based on the pose information of the real viewpoint, the image information visible within the corresponding field of view, namely the initial virtual view, can be acquired from the virtual scene.
Specifically, referring to fig. 3, fig. 3 is a flow chart of an embodiment corresponding to S300 in fig. 1.
As shown in fig. 3, the method for acquiring the initial virtual view includes:
s301, synchronizing virtual viewpoints with the same pose information in the virtual scene based on the pose information of the real viewpoints.
S302, obtaining an initial virtual view based on the pose information of the virtual viewpoint and the virtual scene.
It can be appreciated that by constructing a virtual viewpoint with the same pose information in the virtual scene, the field of view of that virtual viewpoint can be rendered, and the initial virtual view thereby obtained.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating an embodiment of an initial virtual view according to the present application. As shown in fig. 4, a virtual viewpoint is constructed in the virtual scene, and an initial virtual view is acquired based on the view of the virtual viewpoint.
S400, removing the invisible parts of the virtual entity, together with the twin, from the initial virtual view to obtain the target virtual view.
In this embodiment, the positional and depth relations between the virtual entity and the twin of the real entity are processed in the initial virtual view, which ensures the motion consistency of the virtual and real entities.
Specifically, referring to fig. 5, fig. 5 is a flow chart of an embodiment corresponding to S400 in fig. 1.
As shown in fig. 5, the method for generating the target virtual view includes:
S401, traversing each pixel of the initial virtual view, and judging whether both the virtual entity and the twin are present at the current pixel.
If so:
S402, judging whether the depth value of the virtual entity at the current pixel is greater than the depth value of the twin at the current pixel.
If so:
S403, culling the current pixel of the virtual entity.
S404, after the traversal, culling the twin.
Through these steps, the occluded parts of the virtual entity are removed from the virtual view, and then the twin is removed, yielding the target virtual view to be fused with the real view.
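The per-pixel occlusion test of S401 to S403 can be sketched as follows. This is an illustrative sketch, not the patent's implementation: each view is a 2-D grid of depth values, `None` stands for "nothing rendered at this pixel", and the function returns which virtual-entity pixels survive culling (the twin itself is discarded entirely afterwards, per S404).

```python
def cull_occluded(virtual_depth, twin_depth):
    """Return a boolean mask of virtual-entity pixels that survive culling.

    A virtual pixel is culled when the twin (the real entity's stand-in)
    is present at the same pixel and lies in front of the virtual entity,
    i.e. the twin's depth is smaller than the virtual entity's depth.
    """
    rows, cols = len(virtual_depth), len(virtual_depth[0])
    keep = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dv, dt = virtual_depth[i][j], twin_depth[i][j]
            if dv is None:
                continue  # no virtual content at this pixel
            if dt is not None and dv > dt:
                continue  # twin occludes the virtual entity: cull this pixel
            keep[i][j] = True
    return keep


# Virtual entity at depth 5 everywhere; twin occupies the left column at depth 2,
# so the left column of the virtual entity is occluded and gets culled.
v = [[5.0, 5.0], [5.0, 5.0]]
t = [[2.0, None], [2.0, None]]
print(cull_occluded(v, t))  # → [[False, True], [False, True]]
```

The surviving mask corresponds to the visible parts of the virtual entity that remain in the target virtual view.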
For example, referring to fig. 6, fig. 6 is a schematic diagram of an embodiment of the target virtual view of the present application. As shown in fig. 6, the occluded parts of the virtual entity, together with the twin, are removed from the initial virtual view to obtain the target virtual view.
Specifically, culling can be implemented through the per-object rendering settings of the simulation engine. The engine supports drawing both the texture and the depth of an object. If an object's texture information is set not to be written during rendering, the object appears on the final picture as untextured black; if both its texture information and depth information are set not to be written, the object does not appear on the final picture at all.
Referring to fig. 7, fig. 7 shows the effect of culling: a is the original image; in b the texture information of the grass is not written, so the grass is displayed in black; in c neither the texture information nor the depth information of the grass is written, so the grass is not displayed.
Based on this principle, objects can be culled in the virtual scene through rendering control. Specifically, in one embodiment, when the real viewpoint is a depth camera, the final fused view image is transmitted to a display screen. The texture information and depth information of the objects to be culled can therefore both be set not to be written, so these objects do not appear in the target virtual view; the culled regions are replaced by the corresponding parts of the real view during fusion, achieving the virtual-real fusion effect.
In another embodiment, when the real viewpoint is a human eye, the virtual view is projected onto an optical perspective device, such as a semitransparent lens, in front of the eye. In this case only the texture information of the objects to be culled is set not to be written, so those objects appear black in the target virtual view. When the view is projected onto the optical perspective device, the corresponding regions emit no light, and the eye instead sees the real scene through the semitransparent lens in those regions, achieving the virtual-real fusion effect.
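The two removal modes above reduce to different per-object write flags. The sketch below is a hypothetical illustration of that choice (the flag names are invented, not an engine API): a depth-camera (video see-through) setup disables both texture and depth writes so the object is fully absent, while a human-eye (optical see-through) setup disables only texture writes so the object renders black and lets the real scene show through.

```python
def removal_render_state(viewpoint_kind: str) -> dict:
    """Render-state flags for a culled object, per viewpoint kind."""
    if viewpoint_kind == "depth_camera":
        # Neither texture nor depth written: object does not appear at all.
        return {"write_texture": False, "write_depth": False}
    if viewpoint_kind == "human_eye":
        # Only texture suppressed: object renders black; on an optical
        # see-through device a black region emits no light, so the real
        # scene is visible there instead.
        return {"write_texture": False, "write_depth": True}
    raise ValueError(f"unknown viewpoint kind: {viewpoint_kind}")


print(removal_render_state("depth_camera"))
print(removal_render_state("human_eye"))
```

In a real engine these flags correspond to per-material or per-pass color and depth write masks.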
S500, obtaining a fusion view based on the target virtual view.
Because the target virtual view includes visible virtual entities, a fused view can be obtained based on the target virtual view.
In one embodiment, when the real viewpoint is a depth camera, the fused view is output to a display device to realize fused visualization.
In this case, the real view is first obtained based on the pose information of the real viewpoint and the real scene, and then the real view and the target virtual view are fused to obtain the fused view.
The real view and the target virtual view may be fused as follows:
fuse the depth matrix of the real view and the depth matrix of the target virtual view according to the following formula to obtain a fused depth matrix, where a depth matrix describes the depth value at each pixel:
D_f(i,j) = min(D_r(i,j), D_v(i,j)),
where D_f represents the fused depth matrix, D_r the depth matrix of the real view, D_v the depth matrix of the target virtual view, and i and j the coordinates of a pixel;
fuse the pixel matrix of the real view and the pixel matrix of the target virtual view according to the following formula to obtain a fused pixel matrix, where a pixel matrix describes the color value at each pixel:
P_f(i,j) = P_v(i,j) if D_v(i,j) ≤ D_r(i,j), otherwise P_r(i,j),
where P_f represents the fused pixel matrix, P_r the pixel matrix of the real view, P_v the pixel matrix of the target virtual view, and i and j the coordinates of a pixel; and
obtain the fused view based on the fused depth matrix and the fused pixel matrix.
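The matrix fusion step can be sketched per pixel. The depth test used below (take the virtual pixel where the virtual content is nearer, the real pixel otherwise) is an assumption consistent with the surrounding description, in which culled virtual regions carry no depth (modeled here as the far plane) and are therefore replaced by the real view; the function name and data layout are illustrative.

```python
def fuse(depth_real, depth_virtual, pix_real, pix_virtual):
    """Fuse real and target-virtual views pixel by pixel.

    Returns (fused_depth, fused_pixels). At each pixel the nearer source
    wins: where the virtual content is in front, its depth and color are
    taken; elsewhere the real view's depth and color are kept.
    """
    rows, cols = len(depth_real), len(depth_real[0])
    depth_f = [[0.0] * cols for _ in range(rows)]
    pix_f = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if depth_virtual[i][j] <= depth_real[i][j]:
                depth_f[i][j] = depth_virtual[i][j]
                pix_f[i][j] = pix_virtual[i][j]
            else:
                depth_f[i][j] = depth_real[i][j]
                pix_f[i][j] = pix_real[i][j]
    return depth_f, pix_f


# 1x2 example: virtual content is nearer at (0,0); at (0,1) the virtual view
# was culled (far-plane depth), so the real pixel is kept.
FAR = float("inf")
d_f, p_f = fuse([[4.0, 3.0]], [[2.0, FAR]],
                [["real_a", "real_b"]], [["virt", "bg"]])
print(d_f)  # → [[2.0, 3.0]]
print(p_f)  # → [['virt', 'real_b']]
```

This is the same selection rule the write-flag mechanism achieves implicitly: a culled virtual pixel contributes nothing, so the real view fills it.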
Referring to fig. 8, fig. 8 is a schematic diagram illustrating an embodiment of the fusion view of the present application. As shown in fig. 8, the real view and the target virtual view are fused to obtain a fused view, and then the fused view is visually presented.
In another embodiment, when the real viewpoint is a human eye, an optical perspective device is arranged on the observation path, and the user observes the real scene through it. The target virtual view can then be projected directly onto the optical perspective device, so that the real view and the target virtual view are fused and imaged in the eye as the user observes, yielding the fused view.
The optical perspective device may be any device that can display an image while remaining see-through to the external environment, such as a semitransparent lens.
In the fusion methods of the above embodiments, a virtual scene is constructed and synchronized based on the real scene, a virtual view is obtained from the virtual scene, the occluded parts of the virtual entity are removed based on the twin of the real entity, and the result is fused with the real view. The motion of the virtual entity is thus constrained by the real scene, which effectively ensures the dynamic consistency of the virtual and real views, the positional accuracy of the virtual entity in the fused view, and the accuracy of the fused depth information.
The virtual-real view fusion method of the above embodiments can be applied to simulation training. For example, a real training site can be surveyed via drone oblique photography, and the site modeled from the survey data to complete the virtual scene construction. During training, each participating entity wears positioning, attitude, and motion-capture sensors to acquire position, posture, and motion data in real time, which is used to synchronize its twin in the virtual scene.
A participating trainee wears a camera to acquire real view data; the virtual view acquisition module synchronizes the virtual camera to the position, attitude, viewing angle, and focal length in the real view data and acquires the virtual view data.
The virtual-real view fusion module fuses the pictures of the real view and the virtual view into fused view data, and outputs the fused picture to a head-mounted display terminal, realizing the visualization of the fused view.
For example, referring to fig. 9, fig. 9 is a schematic diagram of a real view, a virtual view, and a fused view in an application scenario of the present application. As shown in fig. 9, by fusing the real view and the virtual view, a fused view for simulation training is obtained.
The application also provides a virtual-real view fusion device, refer to fig. 10, and fig. 10 is a schematic structural diagram of an embodiment of the virtual-real view fusion device of the application. As shown in fig. 10, the apparatus includes a virtual scene construction module 21, a synchronization module 22, a virtual view acquisition module 23, a virtual view processing module 24, and a fusion module 25.
The virtual scene construction module 21 is configured to construct a virtual scene based on a real scene, where the real scene includes a real entity, and the virtual scene includes the virtual entity and a twin corresponding to the real entity one by one;
the synchronization module 22 is used for acquiring pose information of a real entity in real time and synchronizing the pose information of the twin body;
the virtual view acquisition module 23 is configured to acquire pose information of a real viewpoint, and obtain an initial virtual view based on the pose information of the real viewpoint and the virtual scene;
The virtual view processing module 24 is configured to cull the invisible parts of the virtual entity, together with the twin, from the initial virtual view to obtain a target virtual view;
The fusion module 25 is configured to obtain a fusion view based on the target virtual view.
The virtual-real view fusion method according to the embodiments of the present specification has been described above with reference to figs. 1 to 9. The details mentioned in the method embodiments apply equally to the virtual-real view fusion device of the embodiments of the present specification. The device may be implemented in hardware, in software, or in a combination of hardware and software.
The application also provides an electronic device; refer to fig. 11, which is a schematic structural diagram of an embodiment of the electronic device. As shown in fig. 11, the electronic device 30 may include at least one processor 31, a memory 32 (e.g., a non-volatile memory), an internal memory 33, and a communication interface 34, connected together via a bus 35. The at least one processor 31 executes at least one computer-readable instruction stored or encoded in the memory 32.
It should be understood that the computer-executable instructions stored in the memory 32, when executed, cause the at least one processor 31 to perform the various operations and functions described above in connection with fig. 1-9 in various embodiments of the present description.
In embodiments of the present description, electronic device 30 may include, but is not limited to, a personal computer, a server computer, a workstation, a desktop computer, a laptop computer, a notebook computer, a mobile electronic device, a smart phone, a tablet computer, a cellular phone, a Personal Digital Assistant (PDA), a handheld device, a messaging device, a wearable electronic device, a consumer electronic device, and the like.
According to one embodiment, a program product, such as a machine-readable medium, is provided. The machine-readable medium may carry instructions (i.e., the elements described above implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with fig. 1 to 9 in various embodiments of the present specification. In particular, a system or apparatus equipped with a readable storage medium storing software program code that implements the functions of any of the above embodiments may be provided, and a computer or processor of the system or apparatus may be caused to read out and execute the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium may implement the functions of any of the above embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present specification.
Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or cloud by a communications network.
It will be appreciated by those skilled in the art that various changes and modifications can be made to the embodiments disclosed above without departing from the spirit of the invention. Accordingly, the scope of protection of this specification should be defined by the appended claims.
It should be noted that not all the steps and units in the above flowcharts and system configuration diagrams are necessary, and some steps or units may be omitted according to actual needs. The order of execution of the steps is not fixed and may be determined as desired. The apparatus structures described in the above embodiments may be physical structures or logical structures; that is, some units may be implemented by the same physical entity, some units may be implemented respectively by multiple physical entities, or some units may be implemented jointly by certain components in multiple independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may include permanently dedicated circuitry or logic (e.g., a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware unit or processor may also include programmable logic or circuitry (e.g., a general-purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The particular implementation (mechanical, permanently dedicated, or temporarily configured) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments, but does not represent all embodiments that may be implemented or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for fusing virtual and real views, comprising:
constructing a virtual scene based on a real scene, wherein the real scene comprises a real entity, and the virtual scene comprises a virtual entity and a twin in one-to-one correspondence with the real entity;
acquiring pose information of the real entity in real time, and synchronizing the pose information of the twin;
acquiring pose information of a real viewpoint, and obtaining an initial virtual view based on the pose information of the real viewpoint and the virtual scene;
culling an invisible part of the virtual entity and the twin from the initial virtual view to obtain a target virtual view; and
obtaining a fused view based on the target virtual view.
2. The fusion method according to claim 1, wherein the step of obtaining the initial virtual view based on the pose information of the real viewpoint and the virtual scene comprises:
synchronizing, in the virtual scene, a virtual viewpoint having the same pose information based on the pose information of the real viewpoint; and
obtaining the initial virtual view based on the pose information of the virtual viewpoint and the virtual scene.
3. The fusion method according to claim 1, wherein the step of culling the invisible part of the virtual entity and the twin from the initial virtual view to obtain the target virtual view comprises:
traversing each pixel of the initial virtual view, and determining whether both the virtual entity and the twin are present at the current pixel being traversed;
if so, determining whether the depth value of the virtual entity at the current pixel is greater than the depth value of the twin at the current pixel;
if so, culling the current pixel of the virtual entity; and
after the traversal is completed, culling the twin.
4. The fusion method according to claim 3, wherein the real viewpoint is a depth camera, the step of culling the current pixel of the virtual entity comprises setting the texture information and the depth information of the current pixel of the virtual entity to not be written, and the step of culling the twin comprises setting the texture information and the depth information of the twin to not be written; or
the real viewpoint is a human eye, an optical see-through device is arranged on the observation path of the real viewpoint, the step of culling the current pixel of the virtual entity comprises setting the texture information of the current pixel of the virtual entity to not be written, and the step of culling the twin comprises setting the texture information of the twin to not be written.
5. The fusion method according to claim 1, wherein the real viewpoint is a depth camera; and
the step of obtaining the fused view based on the target virtual view comprises:
obtaining a real view based on the pose information of the real viewpoint and the real scene; and
fusing the real view with the target virtual view to obtain the fused view.
6. The fusion method according to claim 5, wherein the step of fusing the real view with the target virtual view to obtain the fused view comprises:
fusing the depth matrix of the real view and the depth matrix of the target virtual view according to the following formula to obtain a fused depth matrix, the depth matrix describing the depth value of each pixel:
Df(i,j) = min(Dr(i,j), Dv(i,j)),
where Df denotes the fused depth matrix, Dr denotes the depth matrix of the real view, Dv denotes the depth matrix of the target virtual view, and i, j denote the coordinates of a pixel;
fusing the pixel matrix of the real view and the pixel matrix of the target virtual view according to the following formula to obtain a fused pixel matrix, the pixel matrix describing the color value of each pixel:
Pf(i,j) = Pr(i,j), if Dr(i,j) ≤ Dv(i,j),
Pf(i,j) = Pv(i,j), if Dr(i,j) > Dv(i,j),
where Pf denotes the fused pixel matrix, Pr denotes the pixel matrix of the real view, Pv denotes the pixel matrix of the target virtual view, and i, j denote the coordinates of a pixel; and
obtaining the fused view based on the fused depth matrix and the fused pixel matrix.
7. The fusion method according to claim 1, wherein the real viewpoint is a human eye, and an optical see-through device is arranged on the observation path of the real viewpoint; and
the step of obtaining the fused view based on the target virtual view comprises:
projecting the target virtual view onto the optical see-through device, so that the target virtual view and the real view are fused and imaged in the human eye to obtain the fused view.
8. An apparatus for fusing virtual and real views, comprising:
a virtual scene construction module, configured to construct a virtual scene based on a real scene, wherein the real scene comprises a real entity, and the virtual scene comprises a virtual entity and a twin in one-to-one correspondence with the real entity;
a synchronization module, configured to acquire pose information of the real entity in real time and synchronize the pose information of the twin;
a virtual view acquisition module, configured to acquire pose information of a real viewpoint, and obtain an initial virtual view based on the pose information of the real viewpoint and the virtual scene;
a virtual view processing module, configured to cull an invisible part of the virtual entity and the twin from the initial virtual view to obtain a target virtual view; and
a fusion module, configured to obtain a fused view based on the target virtual view.
9. An electronic device, comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method for fusing virtual and real views according to any one of claims 1 to 7.
10. A machine-readable storage medium storing executable instructions that, when executed, cause a machine to perform the method for fusing virtual and real views according to any one of claims 1 to 7.
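The per-pixel culling of claim 3 and the depth/pixel fusion of claim 6 can be sketched with NumPy. This is an illustrative sketch, not the patented implementation: the array names, the infinite far-plane sentinel for "nothing rendered at this pixel", and the minimum-depth (z-buffer) fusion rule are assumptions consistent with the symbol definitions in the claims (Df, Dr, Dv for depth matrices; Pf, Pr, Pv for pixel matrices).

```python
import numpy as np

FAR = np.inf  # assumed sentinel depth meaning "nothing rendered here"

def cull_virtual_view(virtual_depth, twin_depth, virtual_color):
    """Claim 3 sketch: where both a virtual entity and a twin cover a pixel
    and the twin is closer (smaller depth), cull the virtual entity's pixel
    by not writing its depth/texture; the twins themselves contribute no
    color and are dropped entirely after the traversal."""
    occluded = ((virtual_depth < FAR) & (twin_depth < FAR)
                & (virtual_depth > twin_depth))
    target_depth = virtual_depth.copy()
    target_color = virtual_color.copy()
    target_depth[occluded] = FAR   # depth "not written"
    target_color[occluded] = 0.0   # texture "not written"
    return target_depth, target_color

def fuse_views(real_depth, real_color, target_depth, target_color):
    """Claim 6 sketch: per-pixel z-buffer fusion of the real view and the
    target virtual view -- keep the smaller depth and its color."""
    fused_depth = np.minimum(real_depth, target_depth)          # Df
    virtual_wins = target_depth < real_depth                    # Dv < Dr
    fused_color = np.where(virtual_wins[..., None],
                           target_color, real_color)            # Pf
    return fused_depth, fused_color
```

In a renderer, "not writing" the culled pixel would correspond to disabling the depth and color writes for that fragment (claim 4); here the far-plane sentinel and zeroed color play that role.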
CN202510804180.7A 2025-06-17 2025-06-17 Virtual-real view fusion method and device, electronic equipment and storage medium Active CN120355569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510804180.7A CN120355569B (en) 2025-06-17 2025-06-17 Virtual-real view fusion method and device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN120355569A true CN120355569A (en) 2025-07-22
CN120355569B CN120355569B (en) 2025-09-19

Family

ID=96403950


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013155217A1 (en) * 2012-04-10 2013-10-17 Geisner Kevin A Realistic occlusion for a head mounted augmented reality display
CN108182730A (en) * 2018-01-12 2018-06-19 北京小米移动软件有限公司 Actual situation object synthetic method and device
CN112346572A (en) * 2020-11-11 2021-02-09 南京梦宇三维技术有限公司 A method, system and electronic device for realizing virtual reality fusion
CN113066189A (en) * 2021-04-06 2021-07-02 海信视像科技股份有限公司 Augmented reality equipment and virtual and real object shielding display method
CN118505949A (en) * 2024-06-13 2024-08-16 中电信数字城市科技有限公司 Virtual-actual shielding relation processing method and device and electronic equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO Yifan: "Augmented reality-based training system for assisted treatment of combat trauma", China Master's Theses Full-text Database, Information Science and Technology, no. 3, 13 March 2025 (2025-03-13), pages 138-3218 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant