CN113724309B - Image generation method, device, equipment and storage medium
- Publication number: CN113724309B
- Application number: CN202110996416.3A
- Authority: CN (China)
- Prior art keywords: points, virtual, visual, image, light energy
- Legal status: Active
Classifications (CPC, all under G06T - Image data processing or generation, in general)
- G06T 7/514 - Depth or shape recovery from specularities
- G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T 7/55 - Depth or shape recovery from multiple images
- G06T 7/70 - Determining position or orientation of objects or cameras
Abstract
The embodiment of the application discloses an image generation method, apparatus, device and storage medium, and belongs to the field of computer graphics. The method comprises the following steps: coordinates of a plurality of visible points of a three-dimensional virtual scene are determined; virtual photons are emitted into the three-dimensional virtual scene by taking each pixel point in a target projection image as a starting point, the target projection image being the projection image corresponding to the scene currently to be simulated; the brightness of each of the plurality of visible points is determined based on the light energy of the virtual photons around each visible point; and an image corresponding to the three-dimensional virtual scene is generated based on the brightness of each of the plurality of visible points. According to the embodiment of the application, the image corresponding to the three-dimensional virtual scene is generated automatically, without requiring a physical scene or physical imaging equipment, which reduces the cost of image generation and improves its efficiency.
Description
Technical Field
The embodiment of the application relates to the field of computer graphics, and in particular to an image generation method, apparatus, device, and storage medium.
Background
In recent years, image generation technology has developed rapidly and is applied ever more widely, for example in scientific research, industrial production, medical care, and many other fields, so users' demands on image generation technology are also increasing.
Currently, images are generated by photographing a real scene with professional imaging apparatus. However, professional imaging apparatus is expensive, and in some cases a dedicated shooting scene needs to be built manually, which is costly and inefficient, so an economical and efficient image generation method is needed.
Disclosure of Invention
The embodiment of the application provides an image generation method, apparatus, device and storage medium, which can address the image generation problems of the related art. The technical scheme is as follows:
in one aspect, there is provided an image generation method, the method including:
Determining coordinates of a plurality of visible points of a three-dimensional virtual scene, wherein the plurality of visible points are points corresponding to a plurality of pixel points on an image plane of a virtual camera in the three-dimensional virtual scene;
taking each pixel point in a target projection image as a starting point, and emitting virtual photons into the three-dimensional virtual scene, wherein the target projection image is a projection image corresponding to the scene currently to be simulated;
Determining a brightness of each of the plurality of visual points based on light energy of virtual photons around each of the plurality of visual points;
And generating an image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visual points.
Optionally, the determining coordinates of the multiple visual points of the three-dimensional virtual scene includes:
and determining the coordinates of the plurality of visible points in an inverse ray tracing mode.
Optionally, the determining coordinates of the multiple visible points according to the inverse ray tracing mode includes:
Determining a plurality of rays which are rays led out from each pixel point on an image plane of the virtual camera and pass through an optical center of the virtual camera;
and determining coordinates of a plurality of visible points by taking the points where the plurality of rays intersect with the object surface in the three-dimensional virtual scene as the plurality of visible points.
Optionally, the determining the brightness of each of the plurality of visual points based on the optical energy of the virtual photons around each of the plurality of visual points includes:
Determining a first optical energy for each of the plurality of visual points based on optical energy of virtual photons around each of the plurality of visual points;
And determining the brightness of each visual point in the visual points based on the first light energy of each visual point in the visual points, wherein the brightness is the light energy carried by the light reflected by the corresponding visual point when reaching the optical center of the virtual camera.
Optionally, the determining the first light energy of each of the plurality of visual points based on the light energy of the virtual photons around each of the plurality of visual points includes:
Determining light energy of each of the plurality of visual points at a plurality of moments that are a plurality of different moments that emit virtual photons to the three-dimensional virtual scene based on light energy of virtual photons emitted at the plurality of moments to surroundings of each of the plurality of visual points;
and determining an average value of the light energy of each of the plurality of visual points at the plurality of moments as a first light energy of the corresponding visual point of the plurality of visual points.
Optionally, the determining the light energy of each of the plurality of visual points at the plurality of moments based on the light energy of the virtual photons emitted at the plurality of moments to the surrounding of each of the plurality of visual points includes:
Selecting one time from the multiple times as a target time, and determining the light energy of each of the multiple visual points at the target time according to the following operation until the light energy of each of the multiple visual points at each time is determined:
Determining coordinates of the object surface where the virtual photons emitted at the target moment reside in the three-dimensional virtual scene, so as to obtain coordinates of a plurality of virtual photons;
Determining the optical energy of each virtual photon in the plurality of virtual photons;
The optical energy of each of the plurality of visual points at the target time is determined based on the coordinates of the plurality of visual points, the coordinates of the plurality of virtual photons, and the optical energy of the virtual photons of the plurality of virtual photons that are located around each of the plurality of visual points.
Optionally, the determining the light energy of each virtual photon in the plurality of virtual photons includes:
determining a pixel value of a corresponding pixel point of each virtual photon in the target projection image in a forward ray tracing mode;
The light energy of each of the plurality of virtual photons is determined based on the pixel value of the corresponding pixel point in the target projected image for each of the plurality of virtual photons and the light energy of the virtual light source for emitting the plurality of virtual photons.
Optionally, the determining the optical energy of each of the plurality of visible points at the target time based on the coordinates of the plurality of visible points, the coordinates of the plurality of virtual photons, and the optical energy of the virtual photons located around each of the plurality of visible points in the plurality of virtual photons includes:
Selecting one of the plurality of visual points, and determining the light energy of the selected visual point at the target moment according to the following operation until the light energy of each visual point at the target moment is determined:
Determining virtual photons within a specified range based on the coordinates of the selected visual point and the coordinates of the plurality of virtual photons, wherein the specified range is a sphere range with the selected visual point as a sphere center and a specified numerical value as a radius;
And determining the sum of the optical energy of the virtual photons in the designated range as the optical energy of the selected visible point at the target moment.
Optionally, the image corresponding to the three-dimensional virtual scene includes a binocular image, and the coordinates of the multiple visual points are coordinates of the multiple visual points in a world coordinate system of the three-dimensional virtual scene;
The generating the image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visual points further includes:
Converting coordinates of each of the plurality of visual points in the world coordinate system to coordinates in a camera coordinate system of the virtual camera;
and correspondingly storing coordinates of the binocular image and the plurality of visual points in the camera coordinate system.
Optionally, the method further comprises:
Taking the stored multiple binocular images as input of a neural network model to be trained, taking vertical coordinates of multiple visual points corresponding to each binocular image in the multiple binocular images in a camera coordinate system of the virtual camera as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
Optionally, the image corresponding to the three-dimensional virtual scene comprises a binocular image;
The generating the image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visual points further includes:
generating a disparity map based on the binocular image;
and correspondingly storing the binocular image and the disparity map.
Optionally, the method further comprises:
Converting each parallax map in the stored plurality of parallax maps into a depth map to obtain a plurality of depth maps;
Taking the stored multiple binocular images as input of a neural network model to be trained, taking a depth map obtained by converting a parallax map corresponding to each binocular image in the multiple binocular images as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
In another aspect, there is provided an image generating apparatus, the apparatus including:
the first determining module is used for determining coordinates of a plurality of visible points of the three-dimensional virtual scene, wherein the visible points are points corresponding to a plurality of pixel points on an image plane of the virtual camera in the three-dimensional virtual scene;
The emission module is used for emitting virtual photons to the three-dimensional virtual scene by taking each pixel point in the target projection image as a starting point, wherein the target projection image is a projection image corresponding to the scene to be simulated currently;
A second determining module for determining a brightness of each of the plurality of visual points based on light energy of virtual photons around each of the plurality of visual points;
And the first generation module is used for generating an image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visual points.
Optionally, the first determining module includes:
and the first determining submodule is used for determining the coordinates of the plurality of visual points in a reverse ray tracing mode.
Optionally, the first determining submodule includes:
a first determining unit configured to determine a plurality of rays, the plurality of rays being rays that are extracted from each pixel point on an image plane of the virtual camera and pass through an optical center of the virtual camera;
and the second determining unit is used for determining coordinates of a plurality of visible points by taking the points, where the plurality of rays intersect with the object surface in the three-dimensional virtual scene, as the plurality of visible points.
Optionally, the second determining module includes:
a second determination sub-module for determining a first light energy of each of the plurality of visual points based on light energy of virtual photons around each of the plurality of visual points;
And the third determining submodule is used for determining the brightness of each visual point in the plurality of visual points based on the first light energy of each visual point in the plurality of visual points, wherein the brightness is the light energy carried by the light rays reflected by the corresponding visual point when reaching the optical center of the virtual camera.
Optionally, the second determining submodule includes:
A third determining unit configured to determine light energy of each of a plurality of visible points at a plurality of times, based on light energy of virtual photons emitted to surroundings of each of the plurality of visible points at the plurality of times, the plurality of times being a plurality of different times at which virtual photons are emitted to the three-dimensional virtual scene;
And a fourth determining unit configured to determine an average value of optical energy of each of the plurality of visible points at the plurality of times as a first optical energy of a corresponding one of the plurality of visible points.
Optionally, the third determining unit is specifically configured to:
Selecting one time from the multiple times as a target time, and determining the light energy of each of the multiple visual points at the target time according to the following operation until the light energy of each of the multiple visual points at each time is determined:
Determining coordinates of the object surface where the virtual photons emitted at the target moment reside in the three-dimensional virtual scene, so as to obtain coordinates of a plurality of virtual photons;
Determining the optical energy of each virtual photon in the plurality of virtual photons;
The optical energy of each of the plurality of visual points at the target time is determined based on the coordinates of the plurality of visual points, the coordinates of the plurality of virtual photons, and the optical energy of the virtual photons of the plurality of virtual photons that are located around each of the plurality of visual points.
Optionally, the third determining unit is specifically configured to:
determining a pixel value of a corresponding pixel point of each virtual photon in the target projection image in a forward ray tracing mode;
The light energy of each of the plurality of virtual photons is determined based on the pixel value of the corresponding pixel point in the target projected image for each of the plurality of virtual photons and the light energy of the virtual light source for emitting the plurality of virtual photons.
Optionally, the third determining unit is specifically configured to:
Selecting one of the plurality of visual points, and determining the light energy of the selected visual point at the target moment according to the following operation until the light energy of each visual point at the target moment is determined:
Determining virtual photons within a specified range based on the coordinates of the selected visual point and the coordinates of the plurality of virtual photons, wherein the specified range is a sphere range with the selected visual point as a sphere center and a specified numerical value as a radius;
And determining the sum of the optical energy of the virtual photons in the designated range as the optical energy of the selected visible point at the target moment.
Optionally, the image corresponding to the three-dimensional virtual scene includes a binocular image, and the coordinates of the multiple visual points are coordinates of the multiple visual points in a world coordinate system of the three-dimensional virtual scene;
The apparatus further comprises:
A first conversion module for converting coordinates of each of the plurality of visual points in the world coordinate system into coordinates in a camera coordinate system of the virtual camera;
And the first storage module is used for correspondingly storing the binocular image and the coordinates of the plurality of visual points in the camera coordinate system.
Optionally, the apparatus further includes:
The first training module is used for taking the stored multiple binocular images as input of a neural network model to be trained, taking vertical coordinates of multiple visual points corresponding to each binocular image in the multiple binocular images in a camera coordinate system of the virtual camera as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
Optionally, the image corresponding to the three-dimensional virtual scene comprises a binocular image; the apparatus further comprises:
the second generation module is used for generating a parallax image based on the binocular image;
and the second storage module is used for correspondingly storing the binocular image and the parallax map.
Optionally, the apparatus further includes:
The second conversion module is used for converting each parallax image in the stored plurality of parallax images into a depth image to obtain a plurality of depth images;
The second training module is used for taking the stored multiple binocular images as input of a neural network model to be trained, taking a depth image obtained by converting a parallax image corresponding to each binocular image in the multiple binocular images as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
In another aspect, a computer device is provided, the computer device including a memory for storing a computer program and a processor for executing the computer program stored on the memory to implement the steps of the image generation method described above.
In another aspect, a computer readable storage medium is provided, in which a computer program is stored, which computer program, when being executed by a processor, implements the steps of the image generation method described above.
In another aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the image generation method described above.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
According to the embodiment of the application, the plurality of visual points of the three-dimensional virtual scene and the brightness of each visual point are determined, and the image corresponding to the three-dimensional virtual scene is generated based on those brightnesses, so that the generated image is realistic, the image generation process is automatic, the cost of image generation is reduced, and the efficiency of image generation is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an image generating method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of determining visible points of a three-dimensional virtual scene according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a three-dimensional virtual scene provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of projecting virtual photons into a three-dimensional virtual scene according to an embodiment of the present application;
FIG. 5 is a schematic illustration of a plurality of projected images provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a first light energy for determining a viewable point provided by an embodiment of the application;
FIG. 7 is a schematic diagram of a speckle pattern generated using an image generation method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a real speckle pattern provided by an embodiment of the application;
FIG. 9 is a schematic diagram of a binocular image provided by an embodiment of the present application;
Fig. 10 is a schematic diagram of a parallax map according to an embodiment of the present application;
fig. 11 is a schematic structural view of an image generating apparatus according to an embodiment of the present application;
Fig. 12 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
The execution subject of the image generation method provided by the embodiment of the application can be a computer device. The computer device may be a single computer device or may be a computer cluster composed of a plurality of computer devices.
The computer device may be any electronic product that can perform man-machine interaction with a user through one or more of a keyboard, a touchpad, a touch screen, a remote controller, voice interaction, or a handwriting device, for example a PC (Personal Computer), a handheld computer (Pocket PC, PPC), a tablet computer, and the like.
It will be appreciated by those skilled in the art that the foregoing computer devices are merely exemplary, and that other computer devices now known or hereafter developed, as applicable to and within the scope of the embodiments of the present application, are incorporated herein by reference.
The image generation method provided by the embodiment of the application is explained in detail below.
Fig. 1 is a flowchart of an image generating method according to an embodiment of the present application, where the method is applied to a computer device. Referring to fig. 1, the method includes the following steps.
S101, determining coordinates of a plurality of visible points of a three-dimensional virtual scene, wherein the visible points are points corresponding to a plurality of pixel points on an image plane of a virtual camera in the three-dimensional virtual scene.
In some embodiments, the coordinates of the plurality of visible points may be determined in an inverse ray tracing manner.
The implementation process for determining the coordinates of the plurality of visible points according to the mode of inverse ray tracing comprises the following steps: a plurality of rays are determined, the plurality of rays being rays that exit from each pixel point on an image plane of the virtual camera and pass through an optical center of the virtual camera. And determining coordinates of the plurality of visible points by taking a plurality of points, at which the plurality of rays intersect with the object surface in the three-dimensional virtual scene, as the plurality of visible points.
As known from the principle of pinhole imaging, when a camera is used to capture an object in a three-dimensional scene, a plurality of points on the surface of the object in the three-dimensional scene reflect light to the optical center of the camera, and the light passes through the optical center of the camera to reach corresponding pixel points on the image plane of the camera for imaging. Similarly, when a virtual camera is used to capture an object in a three-dimensional virtual scene, a plurality of points on the surface of the object in the three-dimensional virtual scene reflect light to the optical center of the virtual camera, and the light reaches the corresponding pixel points on the image plane of the virtual camera through the optical center of the virtual camera to form an image, and the plurality of points on the surface of the object are a plurality of visible points of the three-dimensional virtual scene. Therefore, in order to determine a plurality of visible points of the three-dimensional virtual scene, a method of inverse ray tracing, that is, a ray passing through the optical center of the virtual camera is extracted from each pixel point on the image plane of the virtual camera, and a plurality of points where the plurality of rays intersect with the object surface in the three-dimensional virtual scene are determined as the plurality of visible points, may be used.
Since there may be a plurality of objects in the three-dimensional virtual scene, the materials of the objects are different, that is, the materials of the surfaces of the objects intersected by the plurality of rays are different, so that the method for determining the visible point is also different. The object surface that intersects the plurality of rays may be, for example, a rough surface or a specular surface. These two cases will be described below.
In the first case, the surface of the object is a rough surface, and when any one of the plurality of rays intersects the rough surface in the three-dimensional virtual scene, the point of intersection is determined as a visible point of the three-dimensional virtual scene.
In a second case, the surface of the object is a mirror surface, and after any one of the plurality of rays intersects the mirror surface in the three-dimensional virtual scene, a reflection line of the any one ray at the intersection point of the mirror surface is determined. And after the reflection line is intersected with the rough surface in the three-dimensional virtual scene, determining the point at which the reflection line is intersected with the rough surface as a visible point of the three-dimensional virtual scene.
For example, as shown in fig. 2, the object surface in the three-dimensional virtual scene includes a rough surface and a mirror surface. Two rays are led out from a plurality of pixel points on the image plane of the virtual camera to the optical center of the virtual camera. The first ray intersects the rough surface of the object in the three-dimensional virtual scene, and thus a point at which the first ray intersects the rough surface of the object in the three-dimensional virtual scene is determined as a visible point a of the three-dimensional virtual scene. The second ray intersects the mirror surface of the object in the three-dimensional virtual scene, and therefore it is necessary to continue to determine the reflection line of the second ray at the point of intersection with the mirror surface of the object in the three-dimensional virtual scene. The reflection line intersects the rough surface of the object in the three-dimensional virtual scene, so that the intersection point of the reflection line and the rough surface of the object in the three-dimensional virtual scene is determined as another visible point b of the three-dimensional virtual scene.
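As an illustration only (not the implementation of the embodiment), the two cases above can be sketched as a small backward ray tracing routine. The Scene.intersect and Camera.image_plane_pixels interfaces used below are hypothetical placeholders for whatever intersection and camera utilities are actually available, and the hit record is assumed to carry a material tag.

```python
import numpy as np

def visible_point_for_pixel(scene, pixel_pos, optical_center, max_mirror_bounces=4):
    # Ray drawn from the image-plane pixel through the optical center into the scene.
    origin = np.asarray(optical_center, dtype=float)
    direction = origin - np.asarray(pixel_pos, dtype=float)
    direction /= np.linalg.norm(direction)

    for _ in range(max_mirror_bounces + 1):
        hit = scene.intersect(origin, direction)          # hypothetical nearest-hit query
        if hit is None:
            return None                                   # ray leaves the scene: no visible point
        if hit.material == "rough":
            return hit.point                              # case 1: rough surface, the hit is the visible point
        # case 2: mirror surface, follow the reflection line and keep searching
        direction = direction - 2.0 * np.dot(direction, hit.normal) * hit.normal
        origin = hit.point + 1e-6 * direction             # small offset avoids re-hitting the mirror
    return None

def visible_points(scene, camera):
    # One visible point (at most) per pixel on the virtual camera's image plane.
    points = {}
    for pixel_index, pixel_pos in camera.image_plane_pixels():   # hypothetical generator of (index, 3D position)
        vp = visible_point_for_pixel(scene, pixel_pos, camera.optical_center)
        if vp is not None:
            points[pixel_index] = vp
    return points
```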
After the plurality of visual points of the three-dimensional virtual scene are determined according to the above method, a world coordinate system of the three-dimensional virtual scene can be established, and then the coordinates of each of the plurality of visual points are determined based on this world coordinate system. That is, the coordinates of the plurality of visual points are their coordinates in the world coordinate system of the three-dimensional virtual scene. In the embodiment of the present application, any point in the three-dimensional virtual scene may be used as the origin, and the world coordinate system may be established with three mutually perpendicular coordinate axes, for example one pointing horizontally rightward and one pointing vertically downward.
It should be noted that, before determining the coordinates of the multiple visual points of the three-dimensional virtual scene, the three-dimensional virtual scene may also be made. The method for manufacturing the three-dimensional virtual scene comprises the following steps of: based on the three-dimensional entity scene, three-dimensional model data are obtained. The three-dimensional model data is input to a three-dimensional scene editor or a physical simulation tool. And automatically building a corresponding three-dimensional virtual scene by using a three-dimensional scene editor or a physical simulation tool. That is, the three-dimensional virtual scene is a three-dimensional virtual scene automatically built by using the three-dimensional model data of the three-dimensional physical scene as input of the three-dimensional scene editor or the physical simulation tool.
For example, when three-dimensional virtual scenes corresponding to a plurality of packages stacked in an express website are manufactured, three-dimensional model data of the packages stacked in the plurality of packages can be obtained, the three-dimensional model data are input into a three-dimensional scene editor or a physical simulation tool, and the three-dimensional virtual scenes corresponding to the packages stacked in the plurality of packages are automatically built by using the three-dimensional scene editor or the physical simulation tool, so that the three-dimensional virtual scene shown in fig. 3 is obtained.
It should be noted that the virtual camera may be a monocular camera or a binocular camera, and the virtual camera may be deployed at any position in the three-dimensional virtual scene according to the needs of the user.
S102, taking each pixel point in the target projection image as a starting point, and transmitting virtual photons to the three-dimensional virtual scene, wherein the target projection image is a projection image corresponding to the scene to be simulated currently.
The target projection image includes a plurality of pixel points, and when the target projection image is illuminated with the virtual light source, a plurality of virtual photons emitted by the virtual light source are emitted to the three-dimensional virtual scene through each of the plurality of pixel points. In the process of emitting virtual photons, the emitting direction of each virtual photon in the plurality of virtual photons is random, and after the plurality of virtual photons are emitted to the three-dimensional virtual scene, the plurality of virtual photons can reside on the surface of an object in the three-dimensional virtual scene.
As shown in fig. 4, a portion of the plurality of virtual photons emitted toward the three-dimensional virtual scene by the target projection image resides at a viewable point a of the three-dimensional virtual scene. Another portion of the virtual photons resides at a non-visible point c of the three-dimensional virtual scene. After some of the virtual photons reach the visual point b of the three-dimensional virtual scene, they are reflected off the three-dimensional virtual scene.
The target projection image is a projection image corresponding to a scene currently required to be simulated, which is selected from a plurality of stored projection images. Each of the plurality of projection images has a different projection pattern to enable simulation of different illumination scenes, the plurality of projection images being capable of simulating any illumination scene.
For example, as shown in FIG. 5, the plurality of projected images includes a structured light fringe pattern, an infrared staggered dot matrix speckle pattern, and an infrared random dot matrix speckle pattern. The structured light fringe pattern is used for simulating a scene irradiated by structured light, the infrared staggered lattice speckle pattern is used for simulating a scene irradiated by infrared light, and the infrared random lattice speckle pattern is used for simulating a scene irradiated by infrared light. Of course, the plurality of projection images may also include simulated images of other illumination scenes, which are not limited herein.
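To make the emission step concrete, the following sketch (an assumption about one possible realization, not the code of the embodiment) emits virtual photons in random directions from every pixel of the target projection image and records, for each photon that comes to rest on an object surface, its resting position together with the pixel value it passed through; that pixel value is used later in S103 to determine the photon's light energy. Scene.projector_pixel_position and Scene.intersect are hypothetical helpers.

```python
import numpy as np

def random_direction(rng):
    # Uniform random direction on the unit sphere.
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

def emit_virtual_photons(scene, projection_image, photons_per_pixel=8, rng=None):
    rng = rng or np.random.default_rng()
    photons = []                                            # (resting position, pixel value) pairs
    height, width = projection_image.shape                  # grayscale target projection image
    for v in range(height):
        for u in range(width):
            start = scene.projector_pixel_position(u, v)    # hypothetical: 3D position of pixel (u, v)
            for _ in range(photons_per_pixel):
                hit = scene.intersect(start, random_direction(rng))   # hypothetical nearest-hit query
                if hit is not None:                         # photon resides on an object surface
                    photons.append((hit.point, projection_image[v, u]))
                # photons that miss every surface are discarded
    return photons
```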
S103, determining the brightness of each of the plurality of visual points based on the optical energy of virtual photons around each of the plurality of visual points.
In some embodiments, determining the brightness of each of the plurality of visual points based on the light energy of virtual photons around each of the plurality of visual points comprises steps (1) - (2) as follows:
(1) A first optical energy of each of the plurality of visual points is determined based on optical energy of virtual photons around each of the plurality of visual points.
In some embodiments, the light energy of each of the plurality of visual points at a plurality of moments that are different moments when virtual photons are emitted to the three-dimensional virtual scene may be determined based on the light energy of virtual photons emitted at the plurality of moments to the surroundings of each of the plurality of visual points. And determining an average value of the light energy of each of the plurality of visual points at the plurality of moments as a first light energy of the corresponding visual point of the plurality of visual points.
Since the virtual light source may emit virtual photons to the three-dimensional virtual scene without interruption through the target projected image, the plurality of moments may be a plurality of different moments at which the virtual light source emits virtual photons to the three-dimensional virtual scene. In addition, since the direction of the virtual photons passing through each pixel point is random, the number of virtual photons residing near each of the plurality of visual points is different, so that the light energy obtained at the plurality of times by each of the plurality of visual points is different, and thus in some embodiments, the average of the light energy at the plurality of times by each of the plurality of visual points may be determined as the light energy of the corresponding one of the plurality of visual points, i.e., the first light energy. The plurality of times may be a plurality of times having the same interval, or a plurality of times having different intervals.
Since the implementation process of determining the optical energy of each of the plurality of visible points at a plurality of moments is the same, one moment may be selected from the plurality of moments as a target moment, and the optical energy of each of the plurality of visible points at the target moment is determined according to the following operations until the optical energy of each of the plurality of visible points at each moment is determined: and determining coordinates of the object surface where the virtual photons emitted at the target moment reside in the three-dimensional virtual scene so as to obtain coordinates of a plurality of virtual photons. The optical energy of each virtual photon in the plurality of virtual photons is determined. The optical energy of each of the plurality of visual points at the target time is determined based on the coordinates of the plurality of visual points, the coordinates of the plurality of virtual photons, and the optical energy of the virtual photons located around each of the plurality of visual points in the plurality of virtual photons.
Based on the above, each pixel point in the target projection image is taken as a starting point, after virtual photons are emitted to the three-dimensional virtual scene, some virtual photons may reside on the surface of an object in the three-dimensional virtual scene, so when a plurality of virtual photons emitted at the target moment reside on the surface of the object in the three-dimensional virtual scene, in the embodiment of the application, the coordinates of the virtual photons emitted at the target moment can be determined based on the three-dimensional coordinate system of the three-dimensional virtual scene, so as to obtain the coordinates of the plurality of virtual photons. Illustratively, the coordinates of the plurality of virtual photons are determined in a forward ray tracing manner, and the pixel value of a corresponding pixel point in the target projection image for each of the plurality of virtual photons is determined. The light energy of each virtual photon of the plurality of virtual photons is determined based on the pixel value of the corresponding pixel point of each virtual photon of the plurality of virtual photons in the target projected image and the light energy of the virtual light source used to emit the plurality of virtual photons.
That is, for each pixel in the target projection image, a plurality of rays are directed from the virtual light source and pass through the pixel, the plurality of rays pointing in different directions. For any one of the plurality of rays, if the ray intersects an object surface in the three-dimensional virtual scene, the coordinates of the intersection point are determined as the coordinates of the virtual photon emanating from the pixel point, and if the ray does not intersect an object surface in the three-dimensional virtual scene, the ray is discarded. In this way, the coordinates of the plurality of virtual photons can be determined, and a corresponding pixel point of each virtual photon in the target projection image, i.e., from which pixel point in the target projection image the virtual photon emanates, can also be determined. Then, the light energy of each virtual photon can be determined based on the pixel value of the corresponding pixel point of each virtual photon in the target projection image and the light energy of the virtual light source.
For example, after determining a corresponding pixel point of each virtual photon in the target projection image, a correspondence between coordinates of the virtual photon and pixel values of the pixel point may be stored.
Since the virtual photons are emitted by the virtual light source through the pixels in the target projected image, the optical energy of the virtual photons is related to the optical energy of the virtual light source and the pixel values of the pixels in the target projected image. Moreover, based on the above description, not only the coordinate of each virtual photon but also the correspondence between the coordinate of the virtual photon and the pixel value of the pixel point can be determined in a forward ray tracing manner. In this way, the pixel value of the pixel point corresponding to each of the plurality of virtual photons in the target projection image can be determined from the correspondence between the coordinates of the plurality of virtual photons and the pixel value of the pixel point according to the coordinates of the plurality of virtual photons. And then, determining the product of the pixel value of the corresponding pixel point and the light energy of the virtual light source as the light energy of the corresponding virtual photon.
Since the light energy of each of the visible points at the target time is determined in the same manner, one visible point may be selected from the plurality of visible points, and the light energy of the selected visible point at the target time may be determined according to the following operations until the light energy of each of the visible points at the target time is determined: based on the coordinates of the selected visual point and the coordinates of the plurality of virtual photons, virtual photons are determined that lie within a specified range of spheres having the selected visual point as a center of sphere and the specified value as a radius. The sum of the optical energy of the virtual photons within the specified range is determined as the optical energy of the selected visual point at the target instant.
That is, a specified range is determined based on the coordinates of the selected visual point, and the coordinates of the plurality of virtual photons are compared with the specified range, thereby determining virtual photons within the specified range from the plurality of virtual photons. The sum of the optical energy of each virtual photon lying within the specified range may then be determined as the optical energy of the selected visual point at the target instant.
The method is that the light energy of all virtual photons which reside on the surface of an object in the three-dimensional virtual scene is determined, then the virtual photons which are positioned in a specified range are determined from the virtual photons, and the light energy of the selected visible point at the target moment is further determined. Alternatively, in other embodiments, the virtual photons within the specified range may be determined first, then the optical energy of each virtual photon within the specified range is determined, and then the sum of the optical energy of the virtual photons within the specified range is determined as the optical energy of the selected visual point at the target time. Thus, the calculated amount of light energy for determining the virtual photons can be reduced, and the image generation efficiency can be improved.
Since each pixel point on the image plane of the virtual camera has a certain photosensitive range, light energy in a certain range from the surface of an object in the three-dimensional virtual scene can be received, for each of the plurality of visible points, the sum of the light energy of the virtual photons in the designated range of the visible point can be determined, and the sum of the light energy of each virtual photon in the designated range can be determined as the light energy of the corresponding visible point at the target moment.
As shown in fig. 6, one of the pixel points on the image plane of the virtual camera may receive light energy from the specified range of the visible point a (solid lines and broken lines in fig. 6 represent light energy emitted from virtual photons within the specified range of the visible point to the corresponding pixel point), and thus the sum of the light energy of the virtual photons within the specified range of the visible point a is determined as the light energy of the visible point a at the target time. Another pixel point on the image plane of the virtual camera may receive reflected light energy from the mirror surface, which is obtained by reflection of light energy in the specified range of the visible point b by the mirror surface, and thus the sum of light energy of virtual photons located in the specified range of the visible point b is determined as the light energy of the visible point b at the target time.
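A minimal sketch of this step (under the same hypothetical photon layout as above, not the embodiment's code): the light energy of a visible point at one emission moment is the sum of the energies of the virtual photons residing within the specified sphere radius around it, and the first light energy is the average of this quantity over the emission moments.

```python
import numpy as np

def photon_energy(pixel_value, light_source_energy):
    # Light energy of one virtual photon: pixel value times the virtual light source's light energy.
    return pixel_value * light_source_energy

def energy_at_moment(visible_point, photons, light_source_energy, radius):
    # photons: (position, pixel value) pairs residing on object surfaces at one emission moment.
    p = np.asarray(visible_point, dtype=float)
    total = 0.0
    for position, pixel_value in photons:
        if np.linalg.norm(np.asarray(position, dtype=float) - p) <= radius:
            total += photon_energy(pixel_value, light_source_energy)
    return total

def first_light_energy(visible_point, photons_per_moment, light_source_energy, radius):
    # photons_per_moment: one photon list per emission moment; the average is the first light energy.
    energies = [energy_at_moment(visible_point, photons, light_source_energy, radius)
                for photons in photons_per_moment]
    return float(np.mean(energies))
```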
(2) Based on the first light energy of each of the plurality of visual points, determining the brightness of the corresponding visual point of the plurality of visual points, wherein the brightness is the light energy carried by the light reflected by the corresponding visual point when reaching the optical center of the virtual camera.
As known from the principle of pinhole imaging, in the imaging process, light reflected by a plurality of points on the surface of an object in a three-dimensional scene reaches a corresponding pixel point on the image plane of the camera through the optical center of the camera, and the brightness of each point is determined by the light energy carried by the light reflected by each point on the surface of the object. Similarly, in the imaging process, the light reflected by each of the multiple visual points of the three-dimensional virtual scene reaches the corresponding pixel point on the image plane of the virtual camera through the optical center of the virtual camera, and the light energy carried by the light emitted by each of the multiple visual points determines the brightness of the visual point. Thus, the luminance of each of the plurality of visual points may be determined based on the first light energy of the respective visual point.
Based on the foregoing description, the first light energy of each of the plurality of viewable points is determined by a virtual photon emitted by a virtual light source, the illumination of the viewable point by the virtual light source being referred to as indirect illumination. In some cases, however, the multiple visual points may also have self-luminescence and/or direct illumination, such as when the object at which a certain visual point is located is a luminescent object. When a light source directly irradiates a certain visual point, the visual point has direct illumination. Thus, there is also a need to determine a second light energy and a third light energy for each of the plurality of viewable points, the second light energy being self-luminous light energy for the viewable point, the third light energy being light energy for direct illumination of the viewable point. Thereafter, the brightness of each of the plurality of visual points is determined based on the first, second, and third light energies of the respective visual points.
Based on whether each of the plurality of visual points has self-luminescence and direct illumination, determining the brightness of each of the plurality of visual points includes four cases, and each of the four cases is described below by taking any one of the visual points as an example. For convenience of description, any one of the plurality of visual points is referred to as a target visual point.
In the first case, the target visual point has self-luminescence and direct illumination, and at this time, the brightness of the target visual point may be determined based on the first light energy, the second light energy, and the third light energy of the target visual point.
As an example, the brightness of the target visual point may be determined by the following rendering equation (1) based on the first, second, and third light energies of the target visual point.
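One plausible form of rendering equation (1), consistent with the parameter definitions below, is:

L(p, \omega_o) = L_e(p, \omega_o) + \int_{s^2} f(p, \omega_o, \omega_i)\,\big[L_d(p, \omega_i) + L_i(p, \omega_i)\big]\cos\theta_i \, d\omega_i \qquad (1)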
In rendering equation (1), p is the target visual point, ω_o is the outgoing light direction at p, that is, the direction from p to the optical center of the virtual camera, ω_i is the direction of the light incident at p, and θ_i is the angle between the incident light and the normal at p. L(p, ω_o) is the brightness of point p. L_e(p, ω_o) is the second light energy in the direction from p to the optical center of the virtual camera, that is, the self-luminous light energy. s² is a sphere centered at p whose radius is the specified value. f(p, ω_o, ω_i) is the bidirectional reflectance function, which determines what proportion of the incident light energy is reflected into the outgoing direction; objects of different materials have different bidirectional reflectance functions.
L_d(p, ω_i) is the third light energy of p in the incident direction, that is, the light energy of direct illumination. L_i(p, ω_i) is the first light energy of p in the incident direction, that is, the light energy of indirect illumination. cos θ_i is the cosine of the angle between the incident light and the normal of the surface on which p lies. dω_i is the differential solid angle about the incident direction at p.
Optionally, since the above integral is an area integral and has no analytical solution, it may be solved by multiple iterations using the Monte Carlo method to obtain the brightness of the target visual point.
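As an illustration of the Monte Carlo approach (a sketch under the assumption of uniform direction sampling on the sphere s², not the embodiment's solver, with brdf, L_d and L_i assumed to be callables), the integral term of equation (1) can be estimated by averaging the integrand over randomly sampled incident directions:

```python
import numpy as np

def mc_integral(p, omega_o, brdf, L_d, L_i, normal, num_samples=200, rng=None):
    # Monte Carlo estimate of the integral in rendering equation (1):
    # sample incident directions uniformly on the unit sphere (pdf = 1 / (4*pi)),
    # average the integrand, and divide by the pdf.
    rng = rng or np.random.default_rng()
    total = 0.0
    for _ in range(num_samples):
        v = rng.normal(size=3)
        omega_i = v / np.linalg.norm(v)
        cos_theta = max(float(np.dot(omega_i, normal)), 0.0)
        total += brdf(p, omega_o, omega_i) * (L_d(p, omega_i) + L_i(p, omega_i)) * cos_theta
    return 4.0 * np.pi * total / num_samples
```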
In the second case, the target visual point has self-luminescence without direct illumination, and at this time, the brightness of the target visual point may be determined based on the first light energy and the second light energy of the target visual point.
As an example, the brightness of the target visual point may be determined by the following rendering equation (2) based on the first light energy and the second light energy of the target visual point.
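One plausible form, following equation (1) with the direct-illumination term omitted, is:

L(p, \omega_o) = L_e(p, \omega_o) + \int_{s^2} f(p, \omega_o, \omega_i)\, L_i(p, \omega_i)\cos\theta_i \, d\omega_i \qquad (2)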
The meaning of each parameter in the above rendering equation (2) and the integral calculation method are already described in the first case, and will not be described here again.
In a third case, the target visual point is not self-luminous and has direct illumination, and at this time, the brightness of the target visual point can be determined based on the first light energy and the third light energy of the target visual point.
As an example, the brightness of the target visual point may be determined by the following rendering equation (3) based on the first light energy and the third light energy of the target visual point.
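One plausible form, following equation (1) with the self-luminous term omitted, is:

L(p, \omega_o) = \int_{s^2} f(p, \omega_o, \omega_i)\,\big[L_d(p, \omega_i) + L_i(p, \omega_i)\big]\cos\theta_i \, d\omega_i \qquad (3)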
The meaning of each parameter in the above rendering equation (3) and the integral calculation method are already described in the first case, and will not be described here again.
In the fourth case, the target visual point is not self-luminous and is not directly illuminated, and at this time, the brightness of the target visual point may be determined based on the first light energy of the target visual point.
As an example, the brightness of the target visual point may be determined by the following rendering equation (4) based on the first light energy of the target visual point.
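One plausible form, following equation (1) with both the self-luminous and direct-illumination terms omitted, is:

L(p, \omega_o) = \int_{s^2} f(p, \omega_o, \omega_i)\, L_i(p, \omega_i)\cos\theta_i \, d\omega_i \qquad (4)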
The meaning of each parameter in the above rendering equation (4) and the integral calculation method are already described in the first case, and will not be described here again.
And S104, generating an image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visual points.
Based on the foregoing, the plurality of projection images including the target projection image are gray-scale images, and the pixel values of the pixel points in the gray-scale images have a certain relationship with the brightness, so after the brightness of each of the plurality of visible points is determined, the pixel value of the pixel point corresponding to each of the plurality of visible points on the image plane of the virtual camera can be determined based on the brightness of each of the plurality of visible points, thereby generating the image corresponding to the three-dimensional virtual scene.
For example, the Monte Carlo method is used to iterate the integral corresponding to each of the plurality of visible points 200 or more times, and the brightness of each visible point is calculated. Thereafter, a speckle pattern as shown in fig. 7 can be obtained based on the brightness of each visible point, where fig. 8 is a speckle pattern of a real scene. By comparing fig. 7 with fig. 8, it can be seen that the speckle pattern generated using the image generation method of the embodiment of the present application is realistic and hardly differs from the real speckle pattern.
Based on the foregoing description, the brightness of each of the plurality of visible points is related to the light energy of the virtual photon passing through the respective pixel point of the target projected image, and the light energy of each virtual photon is related to the pixel point of the target projected image, so that, using different target projected images, images corresponding to different three-dimensional virtual scenes can be generated.
For example, when an infrared staggered lattice speckle pattern or an infrared random lattice speckle pattern is used as a target projection image, a speckle pattern corresponding to a three-dimensional virtual scene may be generated. When the structured-light fringe pattern is used as the target projection image, a structured-light pattern corresponding to the three-dimensional virtual scene can be generated.
It should be noted that, the image corresponding to the three-dimensional virtual scene may include a monocular image or a binocular image. For example, as shown in fig. 9, the image corresponding to the three-dimensional virtual scene includes a binocular image, that is, includes a left eye view and a right eye view.
In the case where the image corresponding to the three-dimensional virtual scene includes a binocular image, the following two processing methods may be adopted for the binocular image.
In a first processing manner, the coordinates of each of the plurality of visual points in the world coordinate system are converted into coordinates in the camera coordinate system of the virtual camera, and the binocular image and the coordinates of the plurality of visual points in the camera coordinate system are correspondingly stored.
As an example, the extrinsic matrix of the virtual camera may be multiplied by the coordinates of each of the plurality of visual points in the world coordinate system to obtain the coordinates of each of the plurality of visual points in the camera coordinate system of the virtual camera.
For example, for any one of the plurality of visual points, the coordinates of the any one visual point in the camera coordinate system of the virtual camera may be determined according to the following formula (5) based on the coordinates of the any one visual point in the world coordinate system of the three-dimensional virtual scene and the external parameter matrix of the virtual camera.
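The body of formula (5) does not survive in this text; based on the parameter description that follows, it is presumably the standard rigid transform between the two coordinate systems, roughly:

$$\begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} = M_0 \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = R_{3\times 3}\begin{bmatrix} x \\ y \\ z \end{bmatrix} + T_{3\times 1} \tag{5}$$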
In the above formula (5), (x_c, y_c, z_c) are the coordinates of the visual point in the camera coordinate system of the virtual camera. (x, y, z) are the coordinates of the visual point in the world coordinate system of the three-dimensional virtual scene. M_0 = [R_{3×3} | T_{3×1}] is the extrinsic matrix of the virtual camera, where R_{3×3} is the rotation matrix and T_{3×1} is the translation vector.
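A minimal NumPy sketch of this conversion, assuming the extrinsics are supplied as a 3×3 rotation R and a 3×1 translation T (function and argument names are illustrative only):

```python
import numpy as np

def world_to_camera(points_world, R, T):
    """Convert visual-point coordinates from the world coordinate system
    to the camera coordinate system of the virtual camera.

    points_world -- array of shape (N, 3), coordinates in the world frame
    R            -- (3, 3) rotation matrix of the extrinsic parameters
    T            -- (3,) or (3, 1) translation of the extrinsic parameters
    """
    T = np.asarray(T).reshape(3)
    # x_c = R @ x_w + T for every point, as in formula (5).
    return points_world @ np.asarray(R).T + T
```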
And in a second processing mode, generating a parallax map based on the binocular image, and storing the binocular image and the parallax map in a corresponding manner.
As an example, for any one of the plurality of visual points, the parallax value of the parallax point corresponding to the pixel point of that visual point may be determined according to the following formula (6), based on the coordinates of the visual point in the world coordinate system of the three-dimensional virtual scene, the separation between the left virtual camera and the right virtual camera, and the focal length in the x direction in the intrinsic parameters of either virtual camera. The parallax value of each parallax point is determined in this way to obtain a parallax map.
disp = f_x · T_x / Z_c    (6)
In the above formula (6), disp refers to the parallax value of the parallax point corresponding to the pixel point of that visual point. f_x refers to the focal length along the horizontal axis in the intrinsic parameters of either virtual camera, for example the focal length along the horizontal axis of the left virtual camera. T_x refers to the separation between the left and right virtual cameras. Z_c refers to the vertical coordinate of that visual point in the world coordinate system of the three-dimensional virtual scene.
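Following formula (6), a disparity map could be filled in with a sketch like the one below. Treating Z_c as a per-point value supplied for each visual point, and the (row, col) pixel layout, are assumptions made for illustration.

```python
import numpy as np

def disparity_map(z_values, pixel_coords, fx, tx, height, width):
    """Compute a disparity map using disp = fx * Tx / Zc for each visual point.

    z_values     -- array of shape (N,), nonzero Z_c of each visual point
    pixel_coords -- integer array of shape (N, 2), (row, col) of the corresponding pixel
    fx           -- focal length along the horizontal axis of the (left) virtual camera
    tx           -- separation between the left and right virtual cameras
    """
    disp = np.zeros((height, width), dtype=np.float64)
    rows, cols = pixel_coords[:, 0], pixel_coords[:, 1]
    disp[rows, cols] = fx * tx / z_values
    return disp
```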
For example, fig. 10 is a disparity map generated based on a left eye view and a right eye view. The closer an object in the three-dimensional virtual scene is to the lens of the virtual camera, the larger its parallax, and the whiter the corresponding region of the parallax map.
The binocular image and the coordinates of the plurality of visual points in the camera coordinate system stored in the first processing mode, as well as the binocular image and the parallax map stored in the second processing mode, can be widely applied to cameras, games, and other fields related to image processing.
As an example, the binocular image may be used to train a neural network model to be trained in the field of cameras. The neural network model to be trained is used for calculating the distance between a target object and the camera. By providing a number of input-output pairs for the neural network model to be trained, the model can learn, through deep learning, an accurate mapping between inputs and outputs, thereby improving its calculation accuracy. The neural network model to be trained can be trained in the following two implementation manners.
In a first implementation manner, a stored plurality of binocular images are used as input of a neural network model to be trained, vertical coordinates of a plurality of visual points corresponding to each binocular image in the plurality of binocular images in a camera coordinate system of the virtual camera are used as output of the neural network model to be trained, and the neural network model to be trained is trained. Wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
That is, for any binocular image of the plurality of binocular images, that binocular image is used as the input of the neural network model to be trained, the vertical coordinates, in the camera coordinate system of the virtual camera, of the plurality of visible points corresponding to that binocular image are used as the output, and the neural network model is trained accordingly. After the model has been trained on the plurality of binocular images and the visible points corresponding to each of them, the training process of the neural network model to be trained is complete.
In a second implementation, each disparity map of the stored plurality of disparity maps is converted into a depth map, resulting in a plurality of depth maps. And taking the stored multiple binocular images as input of a neural network model to be trained, taking a depth map obtained by converting the parallax map corresponding to each binocular image in the multiple binocular images as output of the neural network model to be trained, and training the neural network model to be trained. Wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
That is, for any binocular image of the plurality of binocular images, that binocular image is used as the input of the neural network model to be trained, the depth map obtained by converting the disparity map corresponding to that binocular image is used as the output, and the neural network model is trained accordingly. After the model has been trained on the plurality of binocular images and the depth maps converted from their corresponding disparity maps, the training process of the neural network model to be trained is complete.
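For concreteness, a skeleton of this second implementation might look like the following. The PyTorch-based model, loss function, and data layout are all assumptions, since the embodiment does not prescribe a specific network architecture or framework.

```python
import torch
import torch.nn as nn

def train_depth_model(model, dataloader, epochs=10, lr=1e-4):
    """Train a depth-estimation model on generated binocular images.

    dataloader yields (binocular, depth) pairs:
      binocular -- tensor (B, 6, H, W), left and right views stacked on the channel axis
      depth     -- tensor (B, 1, H, W), depth map converted from the stored disparity map
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.L1Loss()
    for _ in range(epochs):
        for binocular, depth in dataloader:
            pred = model(binocular)          # predicted depth map
            loss = criterion(pred, depth)    # compare with ground-truth depth
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```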
The parallax map may be converted into a corresponding depth map based on relevant parameters of the camera, and the method for converting the parallax map into the depth map may refer to the related art, which is not limited herein.
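One common conversion, consistent with formula (6) but stated here only as an illustrative assumption, inverts the relation to obtain Z_c = f_x · T_x / disp:

```python
import numpy as np

def disparity_to_depth(disp, fx, tx, eps=1e-6):
    """Convert a disparity map to a depth map via Zc = fx * Tx / disp.

    Pixels with (near-)zero disparity are left at depth 0 to avoid division by zero.
    """
    depth = np.zeros_like(disp, dtype=np.float64)
    valid = disp > eps
    depth[valid] = fx * tx / disp[valid]
    return depth
```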
Training the neural network model requires ground-truth distance values as its output, and such ground truth cannot practically be obtained through manual labeling. After the binocular images and depth values are generated using the image generation method provided by the embodiment of the application, this ground truth becomes available, which facilitates training of the neural network model to be trained. Moreover, when more images are generated, the processing accuracy of the trained neural network model can be improved.
Fig. 11 is a schematic structural diagram of an image generating apparatus according to an embodiment of the present application, which may be implemented as part or all of a computer device by software, hardware, or a combination of both. Referring to fig. 11, the apparatus includes: a first determination module 1101, a transmission module 1102, a second determination module 1103 and a first generation module 1104.
A first determining module 1101, configured to determine coordinates of a plurality of visible points of a three-dimensional virtual scene, where the plurality of visible points are points corresponding to a plurality of pixel points on an image plane of a virtual camera in the three-dimensional virtual scene;
The transmitting module 1102 is configured to transmit virtual photons to the three-dimensional virtual scene with each pixel point in the target projection image as a starting point, where the target projection image is a projection image corresponding to a scene to be simulated currently;
a second determining module 1103 for determining a brightness of each of the plurality of visual points based on the light energy of the virtual photons around each of the plurality of visual points;
A first generating module 1104 is configured to generate an image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visible points.
Optionally, the first determining module 1101 includes:
And the first determination submodule is used for determining the coordinates of the plurality of visual points in a reverse ray tracing mode.
Optionally, the first determining submodule includes:
A first determining unit configured to determine a plurality of rays, the plurality of rays being rays that are extracted from each pixel point on an image plane of the virtual camera and pass through an optical center of the virtual camera;
And a second determining unit configured to determine coordinates of a plurality of visible points by using, as the plurality of visible points, a plurality of points at which the plurality of rays intersect with the object surface in the three-dimensional virtual scene.
Optionally, the second determining module 1103 includes:
a second determining sub-module for determining a first light energy of each of the plurality of visual points based on light energy of virtual photons around each of the plurality of visual points;
And the third determining submodule is used for determining the brightness of each visual point in the plurality of visual points based on the first light energy of each visual point in the plurality of visual points, wherein the brightness is the light energy carried by the light rays reflected by the corresponding visual point when reaching the optical center of the virtual camera.
Optionally, the third determining submodule includes:
A third determining unit configured to determine the light energy of each of the plurality of visual points at a plurality of times based on the light energy of the virtual photons emitted to the surroundings of each of the plurality of visual points at the plurality of times, the plurality of times being a plurality of different times at which virtual photons are emitted to the three-dimensional virtual scene;
And a fourth determining unit configured to determine an average value of light energy at the plurality of times for each of the plurality of visual points as a first light energy for a corresponding visual point of the plurality of visual points.
Optionally, the third determining unit is specifically configured to:
Selecting a time from the plurality of times as a target time, determining light energy of each of the plurality of visual points at the target time according to the following operation until light energy of each of the plurality of visual points at each time is determined:
Determining coordinates of the object surface where the virtual photons emitted at the target moment reside in the three-dimensional virtual scene, so as to obtain coordinates of a plurality of virtual photons;
determining the optical energy of each virtual photon of the plurality of virtual photons;
the optical energy of each of the plurality of visual points at the target time is determined based on the coordinates of the plurality of visual points, the coordinates of the plurality of virtual photons, and the optical energy of the virtual photons located around each of the plurality of visual points in the plurality of virtual photons.
Optionally, the third determining unit is specifically configured to:
Determining a pixel value of a corresponding pixel point of each virtual photon in the plurality of virtual photons in the target projection image in a forward ray tracing mode;
The light energy of each virtual photon of the plurality of virtual photons is determined based on the pixel value of the corresponding pixel point of each virtual photon of the plurality of virtual photons in the target projected image and the light energy of the virtual light source used to emit the plurality of virtual photons.
Optionally, the third determining unit is specifically configured to:
selecting one of the plurality of visual points, determining the light energy of the selected visual point at the target time according to the following operation until the light energy of each visual point at the target time is determined:
Determining virtual photons within a specified range based on the coordinates of the selected visual point and the coordinates of the plurality of virtual photons, the specified range being a sphere range with the selected visual point as a center and the specified value as a radius;
the sum of the optical energy of the virtual photons within the specified range is determined as the optical energy of the selected visual point at the target instant.
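The gathering step described by this unit (summing the energy of the virtual photons that fall inside a sphere of the specified radius around the selected visual point) can be sketched as follows. This is a brute-force illustration with assumed array layouts; in practice a spatial index such as a KD-tree could be used to avoid scanning all photons.

```python
import numpy as np

def light_energy_at_point(visual_point, photon_coords, photon_energy, radius):
    """Sum the energy of the virtual photons within a sphere of the given radius
    centred on the selected visual point.

    photon_coords -- array (M, 3), coordinates where the virtual photons reside
    photon_energy -- array (M,), light energy carried by each virtual photon
    """
    d2 = np.sum((photon_coords - np.asarray(visual_point)) ** 2, axis=1)
    return float(np.sum(photon_energy[d2 <= radius ** 2]))
```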
Optionally, the image corresponding to the three-dimensional virtual scene includes a binocular image, and the coordinates of the plurality of visible points are coordinates of the plurality of visible points in a world coordinate system of the three-dimensional virtual scene;
the apparatus further comprises:
a first conversion module for converting coordinates of each of the plurality of visual points in the world coordinate system into coordinates in a camera coordinate system of the virtual camera;
and the first storage module is used for correspondingly storing the binocular image and the coordinates of the plurality of visual points in the camera coordinate system.
Optionally, the apparatus further comprises:
the first training module is used for taking the stored multiple binocular images as input of a neural network model to be trained, taking vertical coordinates of multiple visual points corresponding to each binocular image in the multiple binocular images in a camera coordinate system of the virtual camera as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
Optionally, the image corresponding to the three-dimensional virtual scene includes a binocular image; the apparatus further comprises:
the second generation module is used for generating a parallax image based on the binocular image;
And the second storage module is used for correspondingly storing the binocular image and the parallax map.
Optionally, the apparatus further comprises:
The second conversion module is used for converting each parallax image in the stored plurality of parallax images into a depth image to obtain a plurality of depth images;
The second training module is used for taking the stored multiple binocular images as input of a neural network model to be trained, taking a depth image obtained by converting a parallax image corresponding to each binocular image in the multiple binocular images as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
The image generation method provided by the embodiment of the application can simulate any projected image, that is, any illumination scene, so as to obtain the image under the corresponding illumination scene. The generated images are realistic and virtually indistinguishable from real images, so they can be used to train deep learning algorithms in multiple fields and improve their precision. In addition, the image generation process is automated and uses a virtual three-dimensional scene and a virtual camera rather than a physical scene and physical equipment, which reduces the cost of image generation and improves its efficiency.
It should be noted that: the image generating apparatus provided in the above embodiment is only exemplified by the division of the above functional modules when generating an image, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above. In addition, the image generating apparatus and the image generating method provided in the foregoing embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments, which are not described herein again.
Fig. 12 is a block diagram of a terminal 1200 according to an embodiment of the present application. The terminal 1200 may be a portable mobile terminal such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. Terminal 1200 may also be referred to as a user device, a portable terminal, a laptop terminal, a desktop terminal, etc.
In general, the terminal 1200 includes: a processor 1201 and a memory 1202.
Processor 1201 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1201 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). Processor 1201 may also include a main processor and a coprocessor. The main processor is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1201 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content required to be displayed by the display screen. In some embodiments, the processor 1201 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1202 may include one or more computer-readable storage media, which may be non-transitory. Memory 1202 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1202 is used to store at least one instruction for execution by processor 1201 to implement the image generation methods provided by the method embodiments of the present application.
In some embodiments, the terminal 1200 may further optionally include: a peripheral interface 1203, and at least one peripheral. The processor 1201, the memory 1202, and the peripheral interface 1203 may be connected by a bus or signal lines. The individual peripheral devices may be connected to the peripheral device interface 1203 via buses, signal lines, or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1204, touch display 1205, camera 1206, audio circuitry 1207, positioning assembly 1208, and power supply 1209.
The peripheral interface 1203 may be used to connect at least one peripheral device associated with an I/O (Input/Output) to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, the memory 1202, and the peripheral interface 1203 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 1201, the memory 1202, and the peripheral interface 1203 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 1204 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 1204 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1204 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the world wide web, metropolitan area networks, intranets, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1204 may further include NFC (Near Field Communication) related circuits, which is not limited in the embodiments of the application.
The display 1205 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display 1205 is a touch display, the display 1205 also has the ability to collect touch signals on or above its surface. The touch signal may be input to the processor 1201 as a control signal for processing. At this time, the display 1205 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 1205, provided on the front panel of the terminal 1200; in other embodiments, there may be at least two displays 1205, respectively disposed on different surfaces of the terminal 1200 or in a folded design; in still other embodiments, the display 1205 may be a flexible display disposed on a curved surface or a folded surface of the terminal 1200. The display 1205 may even be arranged in an irregular, non-rectangular shape, i.e., a shaped screen. The display 1205 can be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 1206 is used to capture images or video. Optionally, camera assembly 1206 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth camera, a wide-angle camera, and a telephoto camera, so as to realize background blurring by fusing the main camera and the depth camera, panoramic and VR (Virtual Reality) shooting by fusing the main camera and the wide-angle camera, or other fusion shooting functions. In some embodiments, camera assembly 1206 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuitry 1207 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 1201 for processing, or inputting the electric signals to the radio frequency circuit 1204 for voice communication. For purposes of stereo acquisition or noise reduction, a plurality of microphones may be respectively disposed at different portions of the terminal 1200. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, the audio circuitry 1207 may also include a headphone jack.
The positioning component 1208 is used to locate the current geographic location of the terminal 1200 to enable navigation or LBS (Location Based Service). The positioning component 1208 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 1209 is used to power the various components in the terminal 1200. The power source 1209 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When the power source 1209 comprises a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
It will be appreciated by those skilled in the art that the structure shown in fig. 12 is not limiting and that more or fewer components than shown may be included or certain components may be combined or a different arrangement of components may be employed.
In some embodiments, there is also provided a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of the image generation method of the above embodiments. For example, the computer readable storage medium may be ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
It is noted that the computer readable storage medium mentioned in the embodiments of the present application may be a non-volatile storage medium, in other words, may be a non-transitory storage medium.
It should be understood that all or part of the steps to implement the above-described embodiments may be implemented by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The computer instructions may be stored in the computer-readable storage medium described above.
That is, in some embodiments, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of the image generation method described above.
It should be understood that references herein to "at least one" mean one or more, and "a plurality" means two or more. In the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone. In addition, to clearly describe the technical solutions of the embodiments of the present application, the words "first", "second", and the like are used to distinguish identical or similar items having substantially the same function and effect. It will be appreciated by those of skill in the art that the words "first", "second", and the like do not limit the quantity or order of execution, and that items described as "first" and "second" are not necessarily different.
The above embodiments are not intended to limit the present application, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principle of the present application should be included in the scope of the present application.
Claims (13)
1. An image generation method, the method comprising:
Determining coordinates of a plurality of visible points of a three-dimensional virtual scene, wherein the plurality of visible points are points corresponding to a plurality of pixel points on an image plane of a virtual camera in the three-dimensional virtual scene;
Selecting a target projection image corresponding to the three-dimensional virtual scene from a plurality of stored projection images based on the three-dimensional virtual scene, when the target projection image is irradiated by using a virtual light source, taking each pixel point in the target projection image as a starting point, emitting virtual photons to the three-dimensional virtual scene, wherein the target projection image is a projection image corresponding to a scene to be simulated currently, and is an infrared staggered lattice speckle pattern, an infrared random lattice speckle pattern or a structural light fringe pattern, wherein the infrared staggered lattice speckle pattern is used for simulating the scene irradiated by infrared light in a staggered manner, the infrared random lattice speckle pattern is used for simulating the scene irradiated by infrared light in a random manner, and the structural light fringe pattern is used for simulating the scene irradiated by structural light;
Determining, based on optical energy of virtual photons emitted at a plurality of moments to the surroundings of each of the plurality of visual points, optical energy of each of the plurality of visual points at the plurality of moments, the plurality of moments being a plurality of different moments when virtual photons are emitted to the three-dimensional virtual scene and the number of virtual photons residing in the vicinity of each of the plurality of visual points being different, and optical energy obtained at the plurality of moments by each of the plurality of visual points being different;
determining an average value of light energy of each of the plurality of visual points at the plurality of moments as a first light energy of a corresponding visual point of the plurality of visual points, wherein the first light energy is the light energy of indirect illumination of the visual point;
Determining a second light energy and a third light energy of each of the plurality of visual points, the second light energy being self-luminous light energy of the visual point, the third light energy being light energy of direct illumination of the visual point;
Determining brightness of each of the plurality of visual points based on the first, second and third light energy of each of the plurality of visual points, the brightness being light energy carried by light reflected by the visual point when reaching a light center of the virtual camera;
And generating an image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visible points, wherein the image is a speckle pattern or a structured light pattern corresponding to the target projection image.
2. The method of claim 1, wherein the determining coordinates of a plurality of viewable points of the three-dimensional virtual scene comprises:
and determining the coordinates of the plurality of visible points in an inverse ray tracing mode.
3. The method of claim 2, wherein determining coordinates of the plurality of visible points in an inverse ray tracing manner comprises:
Determining a plurality of rays which are rays led out from each pixel point on an image plane of the virtual camera and pass through an optical center of the virtual camera;
and determining coordinates of a plurality of visible points by taking the points where the plurality of rays intersect with the object surface in the three-dimensional virtual scene as the plurality of visible points.
4. The method of claim 1, wherein the determining the optical energy of each of the plurality of visual points at the plurality of moments based on the optical energy of the virtual photons emitted at the plurality of moments to the surroundings of each of the plurality of visual points comprises:
Selecting one time from the multiple times as a target time, and determining the light energy of each of the multiple visual points at the target time according to the following operation until the light energy of each of the multiple visual points at each time is determined:
Determining coordinates of the object surface where the virtual photons emitted at the target moment reside in the three-dimensional virtual scene, so as to obtain coordinates of a plurality of virtual photons;
Determining the optical energy of each virtual photon in the plurality of virtual photons;
The optical energy of each of the plurality of visual points at the target time is determined based on the coordinates of the plurality of visual points, the coordinates of the plurality of virtual photons, and the optical energy of the virtual photons of the plurality of virtual photons that are located around each of the plurality of visual points.
5. The method of claim 4, wherein the determining the optical energy of each virtual photon of the plurality of virtual photons comprises:
determining a pixel value of a corresponding pixel point of each virtual photon in the target projection image in a forward ray tracing mode;
The light energy of each of the plurality of virtual photons is determined based on the pixel value of the corresponding pixel point in the target projected image for each of the plurality of virtual photons and the light energy of the virtual light source for emitting the plurality of virtual photons.
6. The method of claim 3, wherein the determining the optical energy of each of the plurality of visual points at the target time based on the coordinates of the plurality of visual points, the coordinates of the plurality of virtual photons, and the optical energy of the virtual photons of the plurality of virtual photons that are located around each of the plurality of visual points comprises:
Selecting one of the plurality of visual points, and determining the light energy of the selected visual point at the target moment according to the following operation until the light energy of each visual point at the target moment is determined:
Determining virtual photons within a specified range based on the coordinates of the selected visual point and the coordinates of the plurality of virtual photons, wherein the specified range is a sphere range with the selected visual point as a sphere center and a specified numerical value as a radius;
And determining the sum of the optical energy of the virtual photons in the designated range as the optical energy of the selected visible point at the target moment.
7. The method of any of claims 1-6, wherein the image corresponding to the three-dimensional virtual scene comprises a binocular image, and the coordinates of the plurality of visual points are coordinates of the plurality of visual points in a world coordinate system of the three-dimensional virtual scene;
The generating the image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visual points further includes:
Converting coordinates of each of the plurality of visual points in the world coordinate system to coordinates in a camera coordinate system of the virtual camera;
and correspondingly storing coordinates of the binocular image and the plurality of visual points in the camera coordinate system.
8. The method of claim 7, wherein the method further comprises:
Taking the stored multiple binocular images as input of a neural network model to be trained, taking vertical coordinates of multiple visual points corresponding to each binocular image in the multiple binocular images in a camera coordinate system of the virtual camera as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
9. The method of any of claims 1-6, wherein the image corresponding to the three-dimensional virtual scene comprises a binocular image;
The generating the image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visual points further includes:
generating a disparity map based on the binocular image;
and correspondingly storing the binocular image and the disparity map.
10. The method of claim 9, wherein the method further comprises:
Converting each parallax map in the stored plurality of parallax maps into a depth map to obtain a plurality of depth maps;
Taking the stored multiple binocular images as input of a neural network model to be trained, taking a depth map obtained by converting a parallax map corresponding to each binocular image in the multiple binocular images as output of the neural network model to be trained, and training the neural network model to be trained;
wherein the plurality of binocular images are images generated based on the three-dimensional virtual scene.
11. An image generation apparatus, the apparatus comprising:
the first determining module is used for determining coordinates of a plurality of visible points of the three-dimensional virtual scene, wherein the visible points are points corresponding to a plurality of pixel points on an image plane of the virtual camera in the three-dimensional virtual scene;
an emission module, configured to select a target projection image corresponding to the three-dimensional virtual scene from a plurality of stored projection images based on the three-dimensional virtual scene, and, when the target projection image is irradiated by a virtual light source, emit virtual photons to the three-dimensional virtual scene with each pixel point in the target projection image as a starting point, wherein the target projection image is a projection image corresponding to a scene to be simulated currently and is an infrared staggered lattice speckle pattern, an infrared random lattice speckle pattern or a structured light fringe pattern, the infrared staggered lattice speckle pattern being used for simulating a scene irradiated by infrared light in a staggered manner, the infrared random lattice speckle pattern being used for simulating a scene irradiated by infrared light in a random manner, and the structured light fringe pattern being used for simulating a scene irradiated by structured light;
a second determining module configured to determine, based on optical energy of virtual photons emitted to surroundings of each of the plurality of visible points at a plurality of moments, optical energy of each of the plurality of visible points at the plurality of moments, the plurality of moments being a plurality of different moments when virtual photons are emitted to the three-dimensional virtual scene and the number of virtual photons residing in the vicinity of each of the plurality of visible points being different, and optical energy obtained at the plurality of moments by each of the plurality of visible points being different; determining an average value of light energy of each of the plurality of visual points at the plurality of moments as a first light energy of a corresponding visual point of the plurality of visual points, wherein the first light energy is the light energy of indirect illumination of the visual point; determining a second light energy and a third light energy of each of the plurality of visual points, the second light energy being self-luminous light energy of the visual point, the third light energy being light energy of direct illumination of the visual point; determining brightness of each of the plurality of visual points based on the first, second and third light energy of each of the plurality of visual points, the brightness being light energy carried by light reflected by the visual point when reaching a light center of the virtual camera;
The first generation module is used for generating an image corresponding to the three-dimensional virtual scene based on the brightness of each of the plurality of visible points, wherein the image is a speckle pattern or a structured light pattern corresponding to the target projection image.
12. A computer device, characterized in that it comprises a memory for storing a computer program and a processor for executing the computer program stored on the memory for carrying out the steps of the method according to any of the preceding claims 1-10.
13. A computer-readable storage medium, characterized in that the storage medium has stored therein a computer program which, when executed by a processor, implements the steps of the method of any of the preceding claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110996416.3A CN113724309B (en) | 2021-08-27 | 2021-08-27 | Image generation method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113724309A CN113724309A (en) | 2021-11-30 |
CN113724309B true CN113724309B (en) | 2024-06-14 |
Family
ID=78678592
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110996416.3A Active CN113724309B (en) | 2021-08-27 | 2021-08-27 | Image generation method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113724309B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114882159A (en) * | 2022-06-07 | 2022-08-09 | 杭州海康威视数字技术股份有限公司 | Infrared image generation method and device, electronic equipment and storage medium |
CN115272556A (en) * | 2022-07-25 | 2022-11-01 | 网易(杭州)网络有限公司 | Method, apparatus, medium, and device for determining reflected light and global light |
CN116723303B (en) * | 2023-08-11 | 2023-12-05 | 腾讯科技(深圳)有限公司 | Picture projection method, device, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107909647A (en) * | 2017-11-22 | 2018-04-13 | 长春理工大学 | The virtual 3D scenes light field projected image method for drafting of the sense of reality based on spatial reuse |
CN111462208A (en) * | 2020-04-05 | 2020-07-28 | 北京工业大学 | An unsupervised depth prediction method based on binocular disparity and epipolar constraints |
CN111563878A (en) * | 2020-03-27 | 2020-08-21 | 中国科学院西安光学精密机械研究所 | Space target positioning method |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030152897A1 (en) * | 2001-12-20 | 2003-08-14 | Bernhard Geiger | Automatic navigation for virtual endoscopy |
US9292973B2 (en) * | 2010-11-08 | 2016-03-22 | Microsoft Technology Licensing, Llc | Automatic variable virtual focus for augmented reality displays |
WO2014052974A2 (en) * | 2012-09-28 | 2014-04-03 | Pelican Imaging Corporation | Generating images from light fields utilizing virtual viewpoints |
US10560683B2 (en) * | 2016-04-08 | 2020-02-11 | Maxx Media Group, LLC | System, method and software for producing three-dimensional images that appear to project forward of or vertically above a display medium using a virtual 3D model made from the simultaneous localization and depth-mapping of the physical features of real objects |
WO2018187743A1 (en) * | 2017-04-06 | 2018-10-11 | Maxx Media Group, LLC | Producing three-dimensional images using a virtual 3d model |
CN107274474B (en) * | 2017-07-03 | 2020-06-23 | 长春理工大学 | Indirect illumination multiplexing method in three-dimensional scene three-dimensional picture drawing |
CN108876840A (en) * | 2018-07-25 | 2018-11-23 | 江阴嘉恒软件技术有限公司 | A method of vertical or forward projection 3-D image is generated using virtual 3d model |
Also Published As
Publication number | Publication date |
---|---|
CN113724309A (en) | 2021-11-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11962930B2 (en) | Method and apparatus for controlling a plurality of virtual characters, device, and storage medium | |
CN113724309B (en) | Image generation method, device, equipment and storage medium | |
CN108525298B (en) | Image processing method, image processing device, storage medium and electronic equipment | |
EP2691938B1 (en) | Selective hand occlusion over virtual projections onto physical surfaces using skeletal tracking | |
CN110827391B (en) | Image rendering method, device and equipment and storage medium | |
KR102633468B1 (en) | Method and device for displaying hotspot maps, and computer devices and readable storage media | |
CN113384880B (en) | Virtual scene display method, device, computer equipment and storage medium | |
CN113205515B (en) | Target detection method, device and computer storage medium | |
EP3832605A1 (en) | Method and device for determining potentially visible set, apparatus, and storage medium | |
CN108701372B (en) | An image processing method and device | |
CN113592997B (en) | Object drawing method, device, equipment and storage medium based on virtual scene | |
CN111179628B (en) | Positioning method and device for automatic driving vehicle, electronic equipment and storage medium | |
CN108668108A (en) | A kind of method, apparatus and electronic equipment of video monitoring | |
CN112308103B (en) | Method and device for generating training samples | |
CN112991439B (en) | Method, device, electronic equipment and medium for positioning target object | |
CN113689484B (en) | Method and device for determining depth information, terminal and storage medium | |
CN113971714B (en) | Target object image rendering method, device, electronic device and storage medium | |
CN114155336B (en) | Virtual object display method, device, electronic device and storage medium | |
US20240169568A1 (en) | Method, device, and computer program product for room layout | |
CN111754564B (en) | Video display method, device, equipment and storage medium | |
US20250069326A1 (en) | Method and apparatus for rendering image, and electronic device | |
CN113569609A (en) | Hand key point detection method and device, computer equipment and storage medium | |
CN117911482B (en) | Image processing method and device | |
CN117197201B (en) | Training method of attitude model, point cloud registration method, equipment and storage medium | |
CN117664097B (en) | How to obtain the base map |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |