
CN116012432A - Stereoscopic panoramic image generation method and device and computer equipment - Google Patents

Stereoscopic panoramic image generation method and device and computer equipment

Info

Publication number
CN116012432A
CN116012432A
Authority
CN
China
Prior art keywords
image
depth
images
edge
fused
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310087696.5A
Other languages
Chinese (zh)
Inventor
李仁义
邓维峰
段文科
周宓
姚伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Nantian Electronics Information Corp ltd
GUANGZHOU NANTIAN COMPUTER SYSTEM CO Ltd
Original Assignee
Yunnan Nantian Electronics Information Corp ltd
GUANGZHOU NANTIAN COMPUTER SYSTEM CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Nantian Electronics Information Corp ltd, GUANGZHOU NANTIAN COMPUTER SYSTEM CO Ltd
Priority to CN202310087696.5A
Publication of CN116012432A

Landscapes

  • Image Processing (AREA)

Abstract

The application relates to a method, an apparatus, a computer device, a storage medium and a computer program product for generating a stereoscopic panoramic image. The method comprises the following steps: acquiring an image sequence obtained by a panoramic shooting operation of a target shooting scene, and fusing at least two adjacent images to be processed in the image sequence to obtain a fused image; inputting the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image; performing edge sharpening processing on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image; determining edge contour data corresponding to the fused image according to the sharpened depth image; segmenting the fused image into a foreground region and a background region according to the edge contour data; and generating a stereoscopic panoramic image of the target shooting scene according to the foreground region and the background region. By adopting the method, the generation efficiency of stereoscopic panoramic images can be improved.

Description

Stereoscopic panoramic image generation method and device and computer equipment
Technical Field
The present invention relates to the field of computer technology, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for generating a stereoscopic panoramic image.
Background
With the continuing development of science and technology, the way people obtain social information in daily life has shifted from plain text to pictures and videos; such visual material either comes from real shooting or is produced with computer software.
Currently, when certain scenes are shot, a single image cannot present a panoramic view of the target scene, and the two-dimensional images captured by a camera or shooting software cannot convey the stereoscopic features of the objects in the target scene. If a stereoscopic panoramic image of the target scene is instead obtained by shooting video, professional software tools are also required to produce it, which places high technical demands on users.
Therefore, the conventional technology has a problem in that the generation efficiency of stereoscopic panoramic images is not high.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a stereoscopic panorama image generation method, apparatus, computer device, computer-readable storage medium, and computer program product that can improve the generation efficiency of stereoscopic panorama images.
A method for generating a stereoscopic panoramic image, the method comprising:
acquiring an image sequence obtained by a panoramic shooting operation of a target shooting scene, and fusing at least two adjacent images to be processed in the image sequence to obtain a fused image;
inputting the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image;
performing edge sharpening processing on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image;
determining edge contour data corresponding to the fused image according to the sharpened depth image;
dividing the fused image into a foreground region and a background region according to the edge contour data;
and generating a stereoscopic panoramic image aiming at the target shooting scene according to the background edge pixel information corresponding to the background area.
In one embodiment, fusing at least two images to be processed connected in the image sequence to obtain a fused image includes:
determining at least two connected images to be processed in an image sequence;
extracting image characteristics of at least two images to be processed to obtain image characteristic matching points corresponding to the at least two images to be processed;
determining a registration structure between at least two images to be processed according to the image feature matching points;
and according to the registration structure, fusing at least two images to be processed to obtain a fused image.
In one embodiment, determining a registration structure between at least two images to be processed according to the image feature matching points includes:
carrying out homography matrix calculation on the image feature matching points to obtain target image feature matching points; the target image feature matching points are other image feature matching points except for the abnormal image feature matching points in the image feature matching points;
and performing perspective transformation matrix calculation on the target image feature matching points, and determining a registration structure between at least two images to be processed.
In one embodiment, according to the registration structure, fusing at least two images to be processed to obtain a fused image includes:
registering at least two images to be processed according to the registration structure to obtain registered images;
segmenting the registered image into at least one segmented image;
repairing the boundary of each block image to obtain a repaired image;
and carrying out feature fusion on the spliced area of the repaired image to obtain a fused image.
In one embodiment, determining edge contour data corresponding to the fused image according to the sharpened depth image includes:
thresholding the sharpened depth image to obtain continuous edges and spots;
marking the continuous edges and spots as binary graphics;
according to the binary drawing, eliminating continuous edges and spots with the number less than the preset number of pixels to obtain eliminated image data;
and performing image similarity measurement on the removed image data to obtain edge contour data.
In one embodiment, the background edge pixel information includes background edge color information and background edge depth information, and the generating a stereoscopic panoramic image for a target shooting scene according to the background edge pixel information corresponding to the background area includes:
inputting background edge color information into a pre-trained color repair network model to obtain repair edge color information, and inputting background edge depth information into a pre-trained depth repair network model to obtain repair edge depth information;
generating a repair edge according to the repair edge color information and the repair edge depth information;
and synthesizing the stereoscopic panoramic image aiming at the target shooting scene according to the restored edge.
A stereoscopic panoramic image generation apparatus, comprising:
the fusion module is used for acquiring an image sequence obtained by panoramic shooting operation aiming at a target shooting scene, and fusing at least two images to be processed connected in the image sequence to obtain a fused image;
The input module is used for inputting the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image;
the processing module is used for carrying out edge sharpening processing on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image;
the determining module is used for determining edge contour data corresponding to the fused image according to the sharpened depth image;
the segmentation module is used for segmenting the fused image into a foreground region and a background region according to the edge contour data;
and the generating module is used for generating a stereoscopic panoramic image aiming at the target shooting scene according to the background edge pixel information corresponding to the background area.
A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method described above when executing the computer program.
A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor realizes the steps of the above-mentioned method.
A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method described above.
According to the above stereoscopic panoramic image generation method, apparatus, computer device, storage medium and computer program product, an image sequence obtained by a panoramic shooting operation of a target shooting scene is acquired, and at least two adjacent images to be processed in the image sequence are fused to obtain a fused image; the fused image is input into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image; edge sharpening processing is performed on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image; edge contour data corresponding to the fused image are determined according to the sharpened depth image; the fused image is segmented into a foreground region and a background region according to the edge contour data; and a stereoscopic panoramic image of the target shooting scene is generated according to the background edge pixel information corresponding to the background region. In this way, the consecutive images to be processed in the image sequence are fused into a single image, so that a panoramic image of the target shooting scene can be synthesized; once the two-dimensional fused image has been acquired, it is converted into a three-dimensional panoramic image without any additional tools, which improves the generation efficiency of stereoscopic panoramic images.
Drawings
FIG. 1 is an application environment diagram of a method of generating a stereoscopic panoramic image in one embodiment;
FIG. 2 is a flow chart of a method for generating a stereoscopic panoramic image according to an embodiment;
FIG. 3 is a flow diagram of a stereoscopic panoramic image generation process in one embodiment;
FIG. 4 is a dynamic view screenshot of a stereoscopic panoramic image in one embodiment;
FIG. 5 is a schematic illustration of the location of a continuous edge and blob in an image, in one embodiment;
fig. 6 is a flowchart of a method for generating a stereoscopic panoramic image according to another embodiment;
fig. 7 is a block diagram of a stereoscopic panorama image generation apparatus according to an embodiment;
fig. 8 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The method for generating the stereoscopic panoramic image, provided by the embodiment of the application, can be applied to an application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be located on a cloud or other network server. The method comprises the steps that a server 104 acquires an image sequence obtained by panoramic shooting aiming at a target shooting scene, and fuses at least two images to be processed connected in the image sequence to obtain a fused image; the server 104 inputs the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image; the server 104 performs edge sharpening processing on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image; the server 104 determines edge contour data corresponding to the fused image according to the sharpened depth image; the server 104 segments the fused image into a foreground region and a background region according to the edge contour data; the server 104 generates a stereoscopic panoramic image for the target shooting scene according to the background edge pixel information corresponding to the background area. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, where the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
In one embodiment, as shown in fig. 2, a method for generating a stereoscopic panoramic image is provided, and the method is applied to the server 104 in fig. 1 for illustration, and includes the following steps:
step S202, obtaining an image sequence obtained by panoramic shooting operation aiming at a target shooting scene, and fusing at least two images to be processed connected in the image sequence to obtain a fused image.
The target shooting scene may be a specific scene to be shot when panoramic shooting is performed.
The image sequence may be a set of images captured consecutively while shooting a specific scene. For example, when shooting the storefront of a coffee shop, the consecutively captured images of the storefront form the image sequence corresponding to that storefront. Two adjacent images in the image sequence may be images with an overlapping area, and may be two consecutive images taken by the shooting device while it moves in a fixed shooting direction.
Wherein the image to be processed may be each image in the sequence of images.
The fused image may be an image obtained by fusing the images in the image sequence.
In a specific implementation, the server acquires the image sequence captured by the image acquisition device for a specific scene and fuses each image in the sequence into a single image, i.e., generates the fused image.
Step S204, inputting the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image.
The monocular depth estimation model may be a neural network model for extracting image depth values, among other things. For example, the monocular depth estimation model may be a monocular depth estimation model trained based on a ResNet-50 (a convolutional neural network model) depth residual network model.
The depth image may be an image in which a distance value (depth value) of each point in a specific scene acquired by the image acquisition device is used as a pixel value.
In specific implementation, the server inputs the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image, and then a depth value corresponding to each pixel point in the fused image is obtained.
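The patent does not prescribe how the model's raw per-pixel depth values are turned into a depth image; as a minimal illustrative sketch (the function name and the near-dark/far-bright convention are assumptions, and the model inference itself is omitted), the predicted depth values can be normalized into an 8-bit grayscale depth image:

```python
import numpy as np

def depth_to_image(depth: np.ndarray) -> np.ndarray:
    """Normalize raw per-pixel depth values into an 8-bit grayscale depth image.

    Near points become dark and far points bright; this convention is a
    common choice, not one fixed by the patent.
    """
    d_min, d_max = depth.min(), depth.max()
    if d_max == d_min:                      # flat scene: avoid divide-by-zero
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - d_min) / (d_max - d_min)
    return (scaled * 255).round().astype(np.uint8)
```

The resulting array can then be processed by the edge sharpening step described below.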
Step S206, carrying out edge sharpening processing on the depth image according to the depth information data corresponding to the depth image, and obtaining a sharpened depth image.
The depth information data may be a depth value corresponding to each pixel point in the depth image.
The sharpened depth image may be a depth image obtained by sharpening the depth image.
In a specific implementation, the server performs a bidirectional median filtering calculation on the depth information data of the depth image corresponding to the fused image to obtain the sharpened depth image.
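The patent does not detail the bidirectional median filtering; as one plausible building block (a plain 3×3 median filter, an assumption rather than the patented method), median filtering removes depth speckle while keeping edges sharp:

```python
import numpy as np

def median_filter3(depth: np.ndarray) -> np.ndarray:
    """3x3 median filter over a depth map: edge-preserving smoothing.

    Borders are handled by edge replication. A real implementation might
    filter along two directions as the 'bidirectional' wording suggests.
    """
    padded = np.pad(depth, 1, mode="edge")
    h, w = depth.shape
    # stack the nine shifted views of the padded map, then take the
    # per-pixel median across them
    windows = np.stack([padded[i:i + h, j:j + w]
                        for i in range(3) for j in range(3)])
    return np.median(windows, axis=0)
```

A single noisy depth value surrounded by consistent neighbours is replaced by the neighbourhood median, while a genuine depth step between two flat regions is left in place.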
Step S208, determining edge contour data corresponding to the fused image according to the sharpened depth image.
The edge profile data may refer to position data of an edge profile of an object in an image in the image.
In specific implementation, the server determines edge contour data corresponding to the fused image according to the sharpened image.
Step S210, dividing the fused image into a foreground area and a background area according to the edge contour data.
In the specific implementation, the server divides the fused image into two partial areas, namely a foreground area and a background area according to the depth value corresponding to the acquired edge contour data.
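The patent derives the split from the depth values along the edge contours; purely for illustration (the single global threshold below is an assumption, not the patented rule), a depth-based foreground/background segmentation can be sketched as:

```python
import numpy as np

def split_foreground_background(depth_img: np.ndarray, threshold: float):
    """Split a depth image into foreground and background masks.

    Pixels nearer than the threshold are treated as foreground; the two
    masks partition the image. A global threshold stands in here for the
    contour-derived depth values used in the patent.
    """
    foreground = depth_img < threshold
    background = ~foreground
    return foreground, background
```

The two boolean masks can then be applied to the fused image to obtain the foreground and background regions.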
Step S212, generating a stereoscopic panoramic image aiming at the target shooting scene according to the background edge pixel information corresponding to the background area.
In the specific implementation, the server generates repair edge information corresponding to the fused image according to background edge pixel information corresponding to the background area, and the server synthesizes the stereoscopic panoramic image aiming at the target shooting scene according to the repair edge information.
In practical application, the process of generating the stereoscopic panoramic image comprises two parts: fusion of the two-dimensional images and generation of the three-dimensional image. Fig. 3 exemplarily provides a flow diagram of the stereoscopic panoramic image generation process.
In the generation process, fusion of the two-dimensional images is completed first: image feature points are extracted from the images in the image sequence and matched; the registration structure of the images is computed from the matched feature points; the images are fused according to the registration structure; the boundary cracks produced by the fusion are repaired; and this process is repeated until the consecutive images in the sequence have been fused into a single fused image, completing the two-dimensional fusion.
After the two-dimensional fusion is completed, the three-dimensional image is generated. A layered depth image corresponding to the fused image is generated by computing the depth information of the fused image; the edge contours of the fused image are computed from the layered depth image; the synthesized colour and depth values of the occluded edge regions are then computed and merged into the layered depth image, and this process is repeated until pixel synthesis has been completed for all boundaries in the fused image. Finally, the synthesized pixels are converted into a three-dimensional mesh, which is rendered into a new image by a rendering model, i.e., the stereoscopic panoramic image is generated. The stereoscopic panoramic image is a dynamic effect image; see fig. 4, which includes a fused two-dimensional image 402 and a dynamic screenshot 404 of the stereoscopic panoramic image generated from the two-dimensional image 402.
According to the above stereoscopic panoramic image generation method, apparatus, computer device, storage medium and computer program product, an image sequence obtained by a panoramic shooting operation of a target shooting scene is acquired, and at least two adjacent images to be processed in the image sequence are fused to obtain a fused image; the fused image is input into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image; edge sharpening processing is performed on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image; edge contour data corresponding to the fused image are determined according to the sharpened depth image; the fused image is segmented into a foreground region and a background region according to the edge contour data; and a stereoscopic panoramic image of the target shooting scene is generated according to the background edge pixel information corresponding to the background region. In this way, the consecutive images to be processed in the image sequence are fused into a single image, so that a panoramic image of the target shooting scene can be synthesized; once the two-dimensional fused image has been acquired, it is converted into a three-dimensional panoramic image without any additional tools, which improves the generation efficiency of stereoscopic panoramic images.
In another embodiment, fusing at least two images to be processed connected in the image sequence to obtain a fused image includes: determining at least two connected images to be processed in an image sequence; extracting image characteristics of at least two images to be processed to obtain image characteristic matching points corresponding to the at least two images to be processed; determining a registration structure between at least two images to be processed according to the image feature matching points; and according to the registration structure, fusing at least two images to be processed to obtain a fused image.
Image features may refer to, among other things, color features, texture features, shape features, and spatial relationship features of an image.
The image feature matching points can be homonymous points between two or more images, and feature information of each feature point in the image feature matching points is similar.
The registration structure may refer to a correspondence between image feature matching points.
In the specific implementation, after determining two connected images to be processed in an image sequence, the server extracts image features of the two images to be processed to obtain image feature matching points corresponding to the two images to be processed, the server determines a registration structure between the two images to be processed according to the image feature matching points, and the server fuses the two images to be processed according to the registration structure.
In practical application, after determining two adjacent images to be processed in the image sequence, the server extracts feature points from the two images using the SURF algorithm (Speeded-Up Robust Features, a robust local feature point detection and description algorithm), computes the matching feature points between the two images using the KNN algorithm (K-Nearest Neighbours, a classification algorithm), obtains the image-space coordinate transformation parameters from the matched feature points, and performs image registration according to these parameters to complete the fusion of the two images. After the server has fused two adjacent images in the image sequence, the fused image must in turn be fused with the remaining images in the sequence, until every image in the sequence has been merged into a single image.
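The SURF descriptor extraction itself is not sketched here; assuming descriptors are already available as float vectors (function name and the 0.75 ratio are illustrative assumptions), the KNN matching step with Lowe's ratio test can be sketched as:

```python
import numpy as np

def knn_ratio_match(desc_a: np.ndarray, desc_b: np.ndarray, ratio: float = 0.75):
    """Match descriptors of image A to image B with k=2 nearest neighbours.

    A match is kept only if the best neighbour is clearly better than the
    second best (Lowe's ratio test), which suppresses ambiguous matches.
    Returns (index_in_A, index_in_B) pairs.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)   # L2 to every B descriptor
        nn = np.argsort(dists)[:2]                   # two nearest neighbours
        if len(nn) == 2 and dists[nn[0]] < ratio * dists[nn[1]]:
            matches.append((i, int(nn[0])))
    return matches
```

Real SURF descriptors are 64- or 128-dimensional; any float vectors work for the sketch.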
According to the technical scheme, the images to be processed connected in the image sequence are subjected to image feature extraction to obtain the corresponding image feature matching points among the images to be processed, so that the registration structure among the images to be processed is determined, the images to be processed are fused according to the registration structure among the images to be processed, the fused images are obtained, the images in the image sequence can be fused into one image, the panoramic image acquisition of a target scene is realized, the generation of a stereoscopic panoramic image is facilitated, and the generation efficiency of the stereoscopic panoramic image is improved.
In another embodiment, determining a registration structure between at least two images to be processed based on image feature matching points includes: carrying out homography matrix calculation on the image feature matching points to obtain target image feature matching points; the target image feature matching points are other image feature matching points except for the abnormal image feature matching points in the image feature matching points; and performing perspective transformation matrix calculation on the target image feature matching points, and determining a registration structure between at least two images to be processed.
Where the homography matrix may refer to a projection matrix from one plane to another. In practical application, the homography matrix can represent the mapping relation between the positions of the corresponding pixel points of the two images.
The target image feature matching point can be an image feature matching point determined after calculation through a homography matrix.
The abnormal image feature matching points can be image feature matching points which are removed after the homography matrix is calculated.
In practical application, the image feature matching points are input into the RANSAC algorithm (Random Sample Consensus, an algorithm for detecting outliers in data) to perform the homography matrix calculation; the computed inliers are taken as the target image feature matching points, and the computed outliers as the abnormal image feature matching points. The target image feature matching points are the correct matches, while the abnormal image feature matching points are invalid or noisy matches.
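The patent fits a homography inside RANSAC; to keep the sketch short, the example below uses a 2-D translation as the model (an assumption — the inlier/outlier logic is the same, only the fitted model differs):

```python
import random
import numpy as np

def ransac_translation(pts_a, pts_b, tol=2.0, iters=100, seed=0):
    """Minimal RANSAC: fit a 2-D translation between matched point sets.

    Returns (inlier_indices, outlier_indices) — the 'target' and
    'abnormal' matches in the patent's terminology.
    """
    rng = random.Random(seed)
    best = np.array([], dtype=int)
    for _ in range(iters):
        i = rng.randrange(len(pts_a))       # minimal sample: one match
        t = pts_b[i] - pts_a[i]             # candidate translation
        err = np.linalg.norm(pts_a + t - pts_b, axis=1)
        inliers = np.nonzero(err < tol)[0]  # matches consistent with t
        if len(inliers) > len(best):
            best = inliers
    inlier_set = set(best.tolist())
    outliers = [i for i in range(len(pts_a)) if i not in inlier_set]
    return sorted(inlier_set), outliers
```

For a homography, the minimal sample would be four matches and the model would be estimated with DLT as described below, but the consensus loop is unchanged.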
Wherein the perspective transformation matrix may characterize a transformation relationship between the pre-perspective image and the post-perspective image. In practical applications, the perspective transformation matrix may refer to a pixel transformation relationship between two images.
In a specific implementation, the server uses the RANSAC algorithm to compute the homography matrix and obtain the inlier data, i.e., the target feature matching points, while the outlier data, i.e., the abnormal image feature matching points, are discarded; the inlier data are then fed into the DLT algorithm (Direct Linear Transform, an algorithm that establishes a direct linear relationship between image point coordinates and the space coordinates of the corresponding object points) to compute the perspective transformation matrix, thereby determining the registration structure between the two images to be processed.
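The DLT step is standard: each point correspondence contributes two linear equations in the nine homography entries, and the solution is the null vector of the stacked system, found via SVD. A minimal sketch (function name assumed, no normalization of coordinates, which real implementations add for numerical stability):

```python
import numpy as np

def dlt_homography(src, dst):
    """Direct Linear Transform: estimate the 3x3 homography H mapping
    src -> dst from at least 4 point correspondences (the RANSAC inliers).

    Builds the 2n x 9 system A h = 0 and solves it by SVD; H is returned
    normalized so that H[2, 2] == 1.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)      # right singular vector of smallest value
    return H / H[2, 2]
```

Applying H to a homogeneous point `(x, y, 1)` and dividing by the last coordinate gives its position in the other image, which is exactly the registration relationship used for fusion.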
According to the technical scheme, the homography matrix calculation is carried out on the image feature matching points, screening is carried out on each image feature matching point pair to obtain the target image feature matching point pair, perspective transformation matrix calculation is carried out according to the target image feature matching point pair, and the registration structure between the images to be processed is determined, so that the images to be processed in the image sequence can be accurately fused, the image fusion efficiency is improved, and the generation efficiency of the stereoscopic panoramic image is improved.
In another embodiment, according to the registration structure, fusing at least two images to be processed to obtain a fused image includes: registering at least two images to be processed according to the registration structure to obtain registered images; segmenting the registered image into at least one segmented image; repairing the boundary of each block image to obtain a repaired image; and carrying out feature fusion on the spliced area of the repaired image to obtain a fused image.
The registered image may be an image formed by matching and overlapping a plurality of images. Image registration may refer to a process of matching and overlaying two or more images acquired at different times, different image acquisition devices, or under different conditions (weather, illuminance, camera position, angle, etc.).
The block image may be a small block image obtained by dividing one image into a plurality of small blocks.
The stitching region may refer to an overlapping region between two images to be stitched. By performing data processing on pixel data of an overlapping area between two images to be spliced, a good image splicing effect can be achieved.
In a specific implementation, the server registers the images to be processed according to the registration structure to obtain a registered image; the server then uses the APAP algorithm (As-Projective-As-Possible Image Stitching, an image stitching algorithm) to divide the registered image into small block images, repairs the seams of the block images to eliminate ghosting and cracks at the stitching positions of the registered image, obtaining a repaired image, and finally performs feature fusion on the stitching region of the repaired image to obtain the fused image.
In practical application, after the server obtains the repaired image, in order to eliminate differences in illumination, noise and exposure between the two images to be processed, the server applies a Multiband Blending strategy (an image fusion algorithm) based on Laplacian pyramid decomposition: the image is divided into small blocks of different sizes, a weighted average is computed for each block, and the pyramid is then reconstructed in reverse to obtain the fused image data.
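The pyramid blend can be sketched in a few lines. This is a simplified version (an assumption, not the patented implementation): box-filter downsampling stands in for a proper Gaussian pyramid, and a mask selects which image dominates at each pixel, blended band by band:

```python
import numpy as np

def _down(img):
    """2x2 box-filter downsample (stand-in for a Gaussian pyramid level)."""
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def _up(img):
    """Nearest-neighbour upsample back to double size."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def blend_multiband(a, b, mask, levels=3):
    """Simplified multiband (Laplacian pyramid) blend of images a and b.

    mask is 1.0 where a should dominate. Each frequency band is blended
    separately, then the pyramid is reconstructed in reverse.
    """
    if levels == 0 or a.shape[0] < 2 or a.shape[1] < 2:
        return mask * a + (1 - mask) * b            # coarsest level
    la = a - _up(_down(a))                          # Laplacian detail bands
    lb = b - _up(_down(b))
    detail = mask * la + (1 - mask) * lb            # blend this band
    coarse = blend_multiband(_down(a), _down(b), _down(mask), levels - 1)
    return _up(coarse) + detail
```

Because low frequencies are blended over a wide area and high frequencies over a narrow one, exposure differences fade smoothly across the seam while fine detail stays crisp.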
According to the technical scheme of the embodiment, the image to be processed is registered according to the registration structure, the registered image is obtained, the registered image is divided into at least one segmented image, boundary restoration is carried out based on each segmented image, a restored image corresponding to the fused image is obtained, feature fusion is carried out on a splicing area of the restored image, and the fused image is obtained, so that the display of the generated splicing area corresponding to the fused image is more natural, the fused image can be accurately generated, the generation quality of the stereoscopic panoramic image is improved, and the generation efficiency of the stereoscopic panoramic image is improved.
In another embodiment, determining the edge contour data corresponding to the fused image according to the sharpened depth image includes: thresholding the sharpened depth image to obtain continuous edges and blobs; marking the continuous edges and blobs as a binary map; removing, according to the binary map, continuous edges and blobs containing fewer than a preset number of pixels to obtain cleaned image data; and performing an image similarity measurement on the cleaned image data to obtain the edge contour data.
The regions corresponding to the continuous edges and blobs may be the image regions shown in FIG. 5.
A binary map refers to an image in which each pixel takes one of only two values (for example, 0 and 255).
In a specific implementation, the server thresholds the sharpened depth image and compares the disparities of adjacent pixels in the depth image to obtain continuous edges and blobs; marks the continuous edges and blobs as binary maps; removes, according to the binary map, continuous edges and blobs containing fewer than a preset number of pixels to obtain cleaned image data; and applies the LPIPS algorithm (Learned Perceptual Image Patch Similarity, an algorithm for measuring the difference between two images) to the cleaned image data to obtain the edge contour data. In practical application, the server may remove components of fewer than 10 pixels according to the binary map to obtain the cleaned image data.
In practical applications, thresholding refers to binarizing an image according to whether each pixel value is above or below a threshold. For example, with the threshold set to 127, all pixels in the image with values greater than 127 are set to 255, and all pixels with values less than 127 are set to 0.
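A minimal sketch of this thresholding and small-component removal, using plain NumPy and a breadth-first connected-component search (a production implementation would more likely use `cv2.threshold` and `cv2.connectedComponentsWithStats`); the 127 threshold and 10-pixel cutoff mirror the examples above.

```python
import numpy as np
from collections import deque

def threshold_and_clean(depth, thresh=127, min_pixels=10):
    """Binarize a depth image, then drop 4-connected components
    (continuous edges / blobs) smaller than min_pixels."""
    binary = depth > thresh          # the binary map: True ~ 255, False ~ 0
    h, w = binary.shape
    seen = np.zeros_like(binary)
    out = binary.copy()
    for sy in range(h):
        for sx in range(w):
            if not binary[sy, sx] or seen[sy, sx]:
                continue
            # flood out one connected component
            comp, q = [(sy, sx)], deque([(sy, sx)])
            seen[sy, sx] = True
            while q:
                y, x = q.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w \
                            and binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        comp.append((ny, nx))
                        q.append((ny, nx))
            if len(comp) < min_pixels:   # too small: treat as noise
                for y, x in comp:
                    out[y, x] = False
    return out
```

Removing sub-threshold components in this way is what eliminates speckle noise before the similarity measurement, leaving only contours large enough to separate foreground from background.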
According to this technical solution, the sharpened depth image is thresholded to obtain continuous edges and blobs, which are marked as a binary map; continuous edges and blobs containing fewer than a preset number of pixels are removed according to the binary map to obtain cleaned image data, which eliminates noise and helps identify the edge contours in the image; and an image similarity measurement is performed on the cleaned image data to obtain the edge contour data, which helps segment the fused image accurately into a foreground region and a background region. This improves the accuracy of edge repair for the fused image and the efficiency of stereoscopic panoramic image generation.
In another embodiment, the background edge pixel information includes background edge color information and background edge depth information, and generating a stereoscopic panoramic image for a target shooting scene according to the background edge pixel information corresponding to the background area includes: inputting background edge color information into a pre-trained color repair network model to obtain repair edge color information, and inputting background edge depth information into a pre-trained depth repair network model to obtain repair edge depth information; generating a repair edge according to the repair edge color information and the repair edge depth information; and synthesizing the stereoscopic panoramic image aiming at the target shooting scene according to the restored edge.
The background edge color information may refer to the colors of the pixels in the background edge region.

The background edge depth information may refer to the depth values of the pixels in the background edge region.

The repair edge color information may refer to the colors of the pixels on the repaired edge.

The repair edge depth information may refer to the depth values of the pixels on the repaired edge.
In a specific implementation, the server inputs the background edge color information into a pre-trained color repair network model to generate repaired edge colors; inputs the background edge depth information into a pre-trained depth repair network model to generate repaired edge depth values; performs edge repair according to the repaired edge colors and depth values to obtain the repaired edge; and synthesizes the stereoscopic panoramic image for the target shooting scene according to the pixel information corresponding to the repaired edge.
To facilitate understanding by those skilled in the art, an example of generating a stereoscopic panoramic image for the target shooting scene according to the background edge pixel information corresponding to the background region is provided below.
First, according to a flood-fill algorithm, the edge region corresponding to the fused image is expanded by 5 pixels for filling, and the context data of the background region's edge area is synthesized to obtain the background edge region pixel data.
Then, the background edge region pixel data is input into an edge repair network model to generate the repaired edge. The edge repair network model comprises a color repair network model and a depth repair network model. The colors of the edge region corresponding to the fused image are input into the color repair network model to generate repaired colors, and the server inputs the depth values of the edge region into the depth repair network model to generate repaired depth values. The server applies the color repair network model and the depth repair network model repeatedly until no new repaired edge is generated, and then converts the pixels corresponding to the repaired edge into a mesh to render the filled-in region, thereby synthesizing the stereoscopic panoramic image.
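The iterative repair loop above can be sketched as follows. This is only an illustration of the control flow: `color_model` and `depth_model` are hypothetical callables standing in for the pre-trained repair networks (whose real interfaces the text does not specify), `dilate` is a simple 4-neighbourhood stand-in for the 5-pixel flood-fill expansion, and the loop stops after one pass instead of re-detecting remaining holes.

```python
import numpy as np

def dilate(mask, px):
    # expand a boolean mask by px pixels (4-neighbourhood), a simple
    # stand-in for the 5-pixel flood-fill expansion described above
    out = mask.copy()
    for _ in range(px):
        grown = out.copy()
        grown[1:, :] |= out[:-1, :]
        grown[:-1, :] |= out[1:, :]
        grown[:, 1:] |= out[:, :-1]
        grown[:, :-1] |= out[:, 1:]
        out = grown
    return out

def repair_background_edges(color, depth, edge_mask, color_model, depth_model,
                            dilate_px=5, max_iters=8):
    """color_model / depth_model are hypothetical callables standing in for
    the pre-trained color and depth repair networks; each takes the current
    image and the region mask and returns a repaired image."""
    for _ in range(max_iters):
        if not edge_mask.any():          # "until no new repaired edge is generated"
            break
        region = dilate(edge_mask, dilate_px)
        color = np.where(region[..., None], color_model(color, region), color)
        depth = np.where(region, depth_model(depth, region), depth)
        # a real implementation would re-detect remaining holes here;
        # this sketch simply stops after one pass
        edge_mask = np.zeros_like(edge_mask)
    return color, depth
```

Constraining the color and depth networks to the dilated edge region keeps the repair local, so pixels far from the background edge are returned unchanged.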
According to this technical solution, inputting the background edge color information into the pre-trained color repair network model to obtain the repair edge color information constrains the color synthesis of the repaired edge, and inputting the background edge depth information into the pre-trained depth repair network model to obtain the repair edge depth information constrains the depth-value synthesis of the repaired edge. The repaired edge is generated from the repair edge color and depth information, and the color and depth information corresponding to the repaired edge are merged into the layered depth image corresponding to the original fused image, so that the stereoscopic panoramic image can be generated accurately and the generation efficiency of the stereoscopic panoramic image is improved.
In another embodiment, as shown in fig. 6, a method for generating a stereoscopic panoramic image is provided, and the method is applied to the server 104 in fig. 1 for illustration, and includes the following steps:
step S602, obtaining an image sequence obtained by performing panoramic shooting operation on a target shooting scene, and determining at least two connected images to be processed in the image sequence.
Step S604, extracting image features of at least two images to be processed, and obtaining image feature matching points corresponding to the at least two images to be processed.
Step S606, determining a registration structure between at least two images to be processed according to the image feature matching points.
Step S608, fusing at least two images to be processed according to the registration structure to obtain a fused image.
Step S610, inputting the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image.
Step S612, performing edge sharpening processing on the depth image according to the depth information data corresponding to the depth image, and obtaining a sharpened depth image.
Step S614, according to the sharpened depth image, edge contour data corresponding to the fused image is determined.
Step S616, the fused image is segmented into a foreground region and a background region according to the edge contour data.
Step S618, generating a stereoscopic panoramic image aiming at the target shooting scene according to the background edge pixel information corresponding to the background area.
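Steps S602 to S618 can be summarized as the following skeleton. The stage functions are passed in as callables because the text does not fix concrete signatures for them; the sketch only illustrates the data flow between the steps.

```python
import numpy as np

def generate_stereo_panorama(frames, fuse, estimate_depth, sharpen_edges,
                             extract_contours, split_fg_bg, repair_and_render):
    # each argument after `frames` is a hypothetical stage callable
    fused = fuse(frames)                      # S602-S608: register and fuse adjacent frames
    depth = estimate_depth(fused)             # S610: monocular depth estimation
    sharp = sharpen_edges(depth)              # S612: edge sharpening on the depth image
    contours = extract_contours(sharp)        # S614: edge contour data
    foreground, background = split_fg_bg(fused, contours)   # S616
    return repair_and_render(foreground, background)        # S618: edge repair + synthesis
```

Keeping each stage behind a narrow interface like this also mirrors the module split of the apparatus embodiment (fusion, input, processing, determining, segmentation, and generation modules).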
For specific limitations on the above steps, reference may be made to the limitations of the method for generating a stereoscopic panoramic image described above.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and the execution order of these sub-steps or stages is not necessarily sequential; they may be performed in turn or alternately with at least some of the other steps, sub-steps, or stages.
Based on the same inventive concept, an embodiment of the present application further provides an apparatus for generating a stereoscopic panoramic image that implements the above method. The implementation of the solution provided by the apparatus is similar to that described in the method above, so for the specific limitations in the apparatus embodiments below, reference may be made to the limitations of the method for generating a stereoscopic panoramic image described above, which are not repeated here.
In one embodiment, as shown in fig. 7, there is provided a generation apparatus of a stereoscopic panorama image, comprising:
the fusion module 702 is configured to obtain an image sequence obtained by performing panoramic shooting on a target shooting scene, fuse at least two images to be processed connected in the image sequence, and obtain a fused image;
the input module 704 is configured to input the fused image to a pre-trained monocular depth estimation model, so as to obtain a depth image corresponding to the fused image;
the processing module 706 is configured to perform edge sharpening processing on the depth image according to depth information data corresponding to the depth image, to obtain a sharpened depth image;
A determining module 708, configured to determine edge contour data corresponding to the fused image according to the sharpened depth image;
a segmentation module 710, configured to segment the fused image into a foreground region and a background region according to the edge contour data;
the generating module 712 is configured to generate a stereoscopic panoramic image for the target shooting scene according to the background edge pixel information corresponding to the background area.
In one embodiment, the fusion module 702 is specifically configured to determine at least two images to be processed that are connected in the image sequence; extracting image characteristics of at least two images to be processed to obtain image characteristic matching points corresponding to the at least two images to be processed; determining a registration structure between at least two images to be processed according to the image feature matching points; and according to the registration structure, fusing at least two images to be processed to obtain a fused image.
In one embodiment, the fusion module 702 is specifically configured to perform homography matrix calculation on the image feature matching points to obtain target image feature matching points; the target image feature matching points are other image feature matching points except for the abnormal image feature matching points in the image feature matching points; and performing perspective transformation matrix calculation on the target image feature matching points, and determining a registration structure between at least two images to be processed.
In one embodiment, the fusion module 702 is specifically configured to register at least two images to be processed according to a registration structure, so as to obtain a registered image; segmenting the registered image into at least one segmented image; repairing the boundary of each block image to obtain a repaired image; and carrying out feature fusion on the spliced area of the repaired image to obtain a fused image.
In one embodiment, the determining module 708 is specifically configured to: threshold the sharpened depth image to obtain continuous edges and blobs; mark the continuous edges and blobs as a binary map; remove, according to the binary map, continuous edges and blobs containing fewer than a preset number of pixels to obtain cleaned image data; and perform an image similarity measurement on the cleaned image data to obtain the edge contour data.
In one embodiment, the background edge pixel information includes background edge color information and background edge depth information, and the generating module 712 is specifically configured to input the background edge color information into a pre-trained color repair network model to obtain repair edge color information, and input the background edge depth information into the pre-trained depth repair network model to obtain repair edge depth information; generating a repair edge according to the repair edge color information and the repair edge depth information; and synthesizing the stereoscopic panoramic image aiming at the target shooting scene according to the restored edge.
Each of the modules in the stereoscopic panorama image generation apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 8. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store the data involved in the method for generating a stereoscopic panoramic image. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a method of generating a stereoscopic panoramic image.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of a method of generating a stereoscopic panoramic image as described above. The steps of a stereoscopic panorama image generation method herein may be the steps of a stereoscopic panorama image generation method of the above-described embodiments.
In one embodiment, a computer-readable storage medium is provided, in which a computer program is stored, which, when executed by a processor, causes the processor to perform the steps of a method for generating a stereoscopic panoramic image as described above. The steps of a stereoscopic panorama image generation method herein may be the steps of a stereoscopic panorama image generation method of the above-described embodiments.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, causes the processor to perform the steps of a method of generating a stereoscopic panoramic image as described above. The steps of a stereoscopic panorama image generation method herein may be the steps of a stereoscopic panorama image generation method of the above-described embodiments.
It should be noted that, user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program stored on a non-transitory computer-readable storage medium which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM is available in various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. The non-relational databases may include, but are not limited to, blockchain-based distributed databases, and the like. The processors referred to in the embodiments provided herein may be, but are not limited to, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, or data processing logic units based on quantum computing.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples only represent a few embodiments of the present application, which are described in more detail and are not to be construed as limiting the scope of the present application. It should be noted that it would be apparent to those skilled in the art that various modifications and improvements could be made without departing from the spirit of the present application, which would be within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A method for generating a stereoscopic panoramic image, the method comprising:
acquiring an image sequence obtained by panoramic shooting operation aiming at a target shooting scene, and fusing at least two images to be processed connected in the image sequence to obtain a fused image;
inputting the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image;
performing edge sharpening processing on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image;
determining edge contour data corresponding to the fused image according to the sharpened depth image;
dividing the fused image into a foreground region and a background region according to the edge contour data;
and generating a stereoscopic panoramic image aiming at the target shooting scene according to the background edge pixel information corresponding to the background area.
2. The method according to claim 1, wherein the fusing at least two images to be processed connected in the image sequence to obtain a fused image comprises:
determining at least two connected images to be processed in the image sequence;
extracting image characteristics of the at least two images to be processed to obtain image characteristic matching points corresponding to the at least two images to be processed;
determining a registration structure between the at least two images to be processed according to the image feature matching points;
and according to the registration structure, fusing the at least two images to be processed to obtain a fused image.
3. The method according to claim 2, wherein said determining a registration structure between the at least two images to be processed from the image feature matching points comprises:
carrying out homography matrix calculation on the image feature matching points to obtain target image feature matching points; the target image feature matching points are other image feature matching points except for the abnormal image feature matching points in the image feature matching points;
and performing perspective transformation matrix calculation on the target image feature matching points, and determining a registration structure between the at least two images to be processed.
4. The method according to claim 2, wherein the fusing the at least two images to be processed according to the registration structure to obtain a fused image includes:
registering the at least two images to be processed according to the registration structure to obtain registered images;
segmenting the registered image into at least one segmented image;
repairing the boundary of each block image to obtain a repaired image;
and carrying out feature fusion on the spliced area of the repaired image to obtain a fused image.
5. The method of claim 1, wherein determining edge profile data corresponding to the fused image from the sharpened depth image comprises:
thresholding the sharpened depth image to obtain continuous edges and blobs;

marking the continuous edges and the blobs as a binary map;

removing, according to the binary map, the continuous edges and the blobs containing fewer than a preset number of pixels to obtain cleaned image data;

and performing an image similarity measurement on the cleaned image data to obtain the edge contour data.
6. The method of claim 1, wherein the background edge pixel information includes background edge color information and background edge depth information, and wherein the generating a stereoscopic panoramic image for the target photographed scene from the background edge pixel information corresponding to the background area includes:
inputting the background edge color information into a pre-trained color repair network model to obtain repair edge color information, and inputting the background edge depth information into a pre-trained depth repair network model to obtain repair edge depth information;
generating a repair edge according to the repair edge color information and the repair edge depth information;
and synthesizing the stereoscopic panoramic image aiming at the target shooting scene according to the repair edge.
7. A stereoscopic panoramic image generation apparatus, the apparatus comprising:
the fusion module is used for acquiring an image sequence obtained by panoramic shooting operation aiming at a target shooting scene, and fusing at least two images to be processed connected in the image sequence to obtain a fused image;
the input module is used for inputting the fused image into a pre-trained monocular depth estimation model to obtain a depth image corresponding to the fused image;
the processing module is used for carrying out edge sharpening processing on the depth image according to the depth information data corresponding to the depth image to obtain a sharpened depth image;
the determining module is used for determining edge contour data corresponding to the fused image according to the sharpened depth image;
the segmentation module is used for segmenting the fused image into a foreground region and a background region according to the edge contour data;
and the generation module is used for generating a stereoscopic panoramic image aiming at the target shooting scene according to the background edge pixel information corresponding to the background area.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
CN202310087696.5A 2023-01-18 2023-01-18 Stereoscopic panoramic image generation method and device and computer equipment Pending CN116012432A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310087696.5A CN116012432A (en) 2023-01-18 2023-01-18 Stereoscopic panoramic image generation method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310087696.5A CN116012432A (en) 2023-01-18 2023-01-18 Stereoscopic panoramic image generation method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN116012432A true CN116012432A (en) 2023-04-25

Family

ID=86021552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310087696.5A Pending CN116012432A (en) 2023-01-18 2023-01-18 Stereoscopic panoramic image generation method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN116012432A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116704129A (en) * 2023-06-14 2023-09-05 维坤智能科技(上海)有限公司 Panoramic view-based three-dimensional image generation method, device, equipment and storage medium
CN116704129B (en) * 2023-06-14 2024-01-30 维坤智能科技(上海)有限公司 Panoramic view-based three-dimensional image generation method, device, equipment and storage medium
CN117197319A (en) * 2023-11-07 2023-12-08 腾讯科技(深圳)有限公司 Image generation method, device, electronic equipment and storage medium
CN117197319B (en) * 2023-11-07 2024-03-22 腾讯科技(深圳)有限公司 Image generation method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110135455B (en) Image matching method, device and computer readable storage medium
Uittenbogaard et al. Privacy protection in street-view panoramas using depth and multi-view imagery
CN111063021B (en) Method and device for establishing three-dimensional reconstruction model of space moving target
US11348267B2 (en) Method and apparatus for generating a three-dimensional model
CN110033475B (en) A method for detecting and eliminating moving objects in aerial images generated by high-resolution textures
US20230274400A1 (en) Automatically removing moving objects from video streams
Li et al. Detail-preserving and content-aware variational multi-view stereo reconstruction
Perra et al. An analysis of 3D point cloud reconstruction from light field images
US11620730B2 (en) Method for merging multiple images and post-processing of panorama
CN108765343A (en) Method, apparatus, terminal and the computer readable storage medium of image procossing
Fu et al. Image stitching techniques applied to plane or 3-D models: a review
CN116012432A (en) Stereoscopic panoramic image generation method and device and computer equipment
CN107563978A (en) Face deblurring method and device
Chandramouli et al. Convnet-based depth estimation, reflection separation and deblurring of plenoptic images
CN111028170A (en) Image processing method, image processing apparatus, electronic device, and readable storage medium
CN111163265A (en) Image processing method, image processing device, mobile terminal and computer storage medium
Hovhannisyan et al. AED-Net: A single image dehazing
Lao et al. Corresnerf: Image correspondence priors for neural radiance fields
CN115564639A (en) Background blurring method and device, computer equipment and storage medium
US9392146B2 (en) Apparatus and method for extracting object
CN115630660B (en) Barcode positioning method and device based on convolutional neural network
CN113034345B (en) Face recognition method and system based on SFM reconstruction
Tsiminaki et al. Joint multi-view texture super-resolution and intrinsic decomposition
Chand et al. Implementation of panoramic image stitching using Python
CN115564663A (en) Image processing method, device, electronic device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination