CN117058520A - Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium - Google Patents
Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium
- Publication number
- CN117058520A CN117058520A CN202311022471.8A CN202311022471A CN117058520A CN 117058520 A CN117058520 A CN 117058520A CN 202311022471 A CN202311022471 A CN 202311022471A CN 117058520 A CN117058520 A CN 117058520A
- Authority
- CN
- China
- Prior art keywords
- coordinate system
- neural network
- sample data
- network model
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 46
- 238000003062 neural network model Methods 0.000 claims abstract description 68
- 238000012549 training Methods 0.000 claims abstract description 44
- 238000006243 chemical reaction Methods 0.000 claims abstract description 8
- 238000003384 imaging method Methods 0.000 claims description 87
- 238000012634 optical imaging Methods 0.000 claims description 41
- 238000013507 mapping Methods 0.000 claims description 28
- 239000011159 matrix material Substances 0.000 claims description 19
- 238000004590 computer program Methods 0.000 claims description 6
- 238000012544 monitoring process Methods 0.000 claims description 6
- 230000000875 corresponding effect Effects 0.000 description 12
- 238000012545 processing Methods 0.000 description 9
- 230000003321 amplification Effects 0.000 description 6
- 238000013528 artificial neural network Methods 0.000 description 6
- 238000013527 convolutional neural network Methods 0.000 description 4
- 230000002596 correlated effect Effects 0.000 description 4
- 230000010287 polarization Effects 0.000 description 4
- 238000012360 testing method Methods 0.000 description 4
- 238000010200 validation analysis Methods 0.000 description 4
- 238000013473 artificial intelligence Methods 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 238000011478 gradient descent method Methods 0.000 description 2
- 238000009434 installation Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000012805 post-processing Methods 0.000 description 2
- 230000036544 posture Effects 0.000 description 2
- 230000000306 recurrent effect Effects 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B26/00—Optical devices or arrangements for the control of light using movable or deformable optical elements
- G02B26/08—Optical devices or arrangements for the control of light using movable or deformable optical elements for controlling the direction of light
- G02B26/10—Scanning systems
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01B—MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
- G01B11/00—Measuring arrangements characterised by the use of optical techniques
- G01B11/24—Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Databases & Information Systems (AREA)
- Optics & Photonics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Image Processing (AREA)
Abstract
The application discloses a three-dimensional panoramic image scene structure identification method, system, equipment and storage medium in the technical field of image recognition, aiming to solve the prior-art problems that training sample images are difficult to acquire and neural network models train inefficiently. The method collects sample data, builds and trains a deep neural network model, and converts visible light images into point cloud images by a specific coordinate conversion method, so that a large number of point cloud images can be obtained quickly and conveniently. Using these point cloud images as sample data when training the neural network model lowers the difficulty of acquiring training sample images and raises the training efficiency of the neural network model.
Description
Technical Field
The application belongs to the technical field of image recognition, relates to recognition of three-dimensional panoramic images, and in particular relates to a scene structure recognition method, system, equipment and storage medium based on a three-dimensional panoramic image.
Background
In recent years, with the development of information technology, artificial intelligence and machine learning, image recognition technology has advanced steadily, and its applications in various fields have gradually matured.
When artificial intelligence is used for image recognition, labeled sample data must first be used to train a neural network model, and the trained model is then used for recognition. In the prior art, neural network models are mostly used to recognize and classify images such as medical images and face images, and are rarely applied to three-dimensional panoramic images. In the prior art, a three-dimensional panoramic image is generated by stitching one or more groups of photos, taken with a camera rotated through 360 degrees, into a panoramic image; after stitching, a series of mathematical calculations yields an equirectangular projection or cube map of the spherical panorama, and computer technology then restores and displays the real scene for all-round interactive viewing.
The patent application with application number 201410675115.0 discloses an all-weather active panoramic sensing device and a 3D panoramic modeling method, comprising a moving-body laser light source, a multi-output omnidirectional vision sensor and a microprocessor. The moving-body laser light source generates a three-dimensional structured projection light source. A hyperboloid mirror and two camera units are arranged in the multi-output omnidirectional vision sensor; a polarization beam-splitter prism placed on the catadioptric light path of the hyperboloid mirror divides the catadioptric light into light containing a given polarization component and light not containing that component. The two camera units, located on the reflection and transmission light paths of the polarization beam-splitter prism respectively, acquire a first panoramic video image containing only polarized light information and a second panoramic video image containing light intensity information. The microprocessor fuses the point cloud geometric information in the first panoramic video image with the color information in the second panoramic video image to construct a panoramic 3D model.
If panoramic images stitched from one or more groups of photos taken with a camera rotated through 360 degrees are used as training samples of the neural network model, the model requires a large amount of sample data during training, so the number of photos to be taken is very large and the computation needed to generate the sample data is very large. If the all-weather active panoramic sensing device and 3D panoramic modeling method above are used to generate training samples, many point cloud images and visible light images must likewise be captured, fused and corrected. In either case, training sample images are difficult to acquire and the neural network model trains inefficiently.
Disclosure of Invention
The aim of the application is to solve the prior-art problems that training sample images are difficult to acquire and neural network models train inefficiently; to this end, the application provides a three-dimensional panoramic image scene structure identification method, system, equipment and storage medium.
The application adopts the following technical solution to achieve this aim:
a three-dimensional panoramic image scene structure identification method comprises the following steps:
step S1, obtaining sample data
Acquiring scene video image data through panoramic monitoring equipment, and converting scene element information in video image sample data into scene element information in point cloud image sample data to form tag data;
s2, building a deep neural network model
Building a deep neural network model;
step S3, training a deep neural network model
Training the deep neural network model constructed in the step S2 by adopting the point cloud image sample data and the label data in the step S1;
step S4, real-time image recognition
Acquiring point cloud image data to be identified, inputting it into the deep neural network model fully trained in step S3, and obtaining the identification result output by the deep neural network model;
in step S1, the visible light image sample data is converted into point cloud image sample data, and the specific conversion method is as follows:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
Further, in the mapping in step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using the relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
Further, the relationship mapping matrix R is generated as follows:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
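As an editorial illustration only (not part of the original disclosure), the following Python sketch shows how a relationship mapping matrix R of the form above, together with an assumed relative offset t, could map scene-element coordinates recognised in the visible light image (coordinate system a) into the laser scanning module's target coordinate system b to form point cloud sample data; all function names, variable names and numeric values are hypothetical.

```python
import numpy as np

def rotation_matrix(alpha, beta, theta):
    """Compose R = R_x(alpha) @ R_y(beta) @ R_z(theta) from the three rotation angles (radians)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    ct, st = np.cos(theta), np.sin(theta)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[ct, -st, 0], [st, ct, 0], [0, 0, 1]])
    return Rx @ Ry @ Rz

def map_to_point_cloud(coords_a, R, t=np.zeros(3)):
    """Map an (N, 3) set of scene-element coordinates from coordinate system a (step S11)
    into the target coordinate system b (step S12); the rows form the point cloud sample (step S13)."""
    coords_a = np.asarray(coords_a, dtype=float)
    return coords_a @ R.T + t

# Hypothetical usage with made-up angles and coordinates:
R = rotation_matrix(np.deg2rad(2.0), np.deg2rad(-1.5), np.deg2rad(0.5))
point_cloud_sample = map_to_point_cloud([[0.5, 1.2, 3.0], [0.7, 1.1, 2.8], [0.4, 1.3, 3.1]], R)
```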
A three-dimensional panoramic image scene structure identification system, comprising:
the sample data acquisition module is used for acquiring scene video image data through the panoramic monitoring equipment, converting scene element information in the video image sample data into scene element information in the point cloud image sample data, and forming tag data;
the deep neural network model building module is used for building a deep neural network model;
the deep neural network model training module is used for training the deep neural network model built by the deep neural network model building module by adopting the point cloud image sample data and the label data in the sample data acquisition module;
the real-time image recognition module is used for acquiring point cloud image data to be recognized and inputting the point cloud image data into the deep neural network model training module to train a mature deep neural network model, and the deep neural network model outputs a recognition result;
the sample data acquisition module converts visible light image sample data into point cloud image sample data, and the specific conversion method comprises the following steps:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method described above.
A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method described above.
The beneficial effects of the application are as follows:
in the application, visible light images are easier to obtain than point cloud images; a specific coordinate conversion method is then used to convert the visible light images into point cloud images, so a large number of point cloud images can be obtained quickly and conveniently. Using these point cloud images as sample data for training the neural network model lowers the difficulty of acquiring training sample images and raises the training efficiency of the neural network model.
In the application, a relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated, and coordinate data are converted between the two different coordinate systems in real time according to this matrix. Because the matrix R for real-time conversion between the two coordinate systems is established in advance, the amount of computation during three-dimensional panoramic imaging is greatly reduced, the panoramic imaging efficiency is markedly improved, and the imaging precision is also greatly improved.
Drawings
FIG. 1 is a schematic flow chart of the method of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Example 1
A three-dimensional panoramic image scene structure identification method comprises the following steps:
step S1, obtaining sample data
Acquiring scene video image data through panoramic monitoring equipment, and converting scene element information in video image sample data into scene element information in point cloud image sample data to form tag data;
s2, building a deep neural network model
Common neural network models include convolutional neural networks (CNN), recurrent neural networks (RNN), deep neural networks (DNN), etc. This embodiment performs feature recognition on point cloud data, so a deep neural network model is built and the built model is used to extract features from the point cloud image data. No particular structure is required for the deep neural network model in this embodiment; an existing deep neural network structure is adopted.
Step S3, training a deep neural network model
And training the deep neural network model constructed in the step S2 by adopting the point cloud image sample data and the label data in the step S1.
In this embodiment, no particular training method is required for the deep neural network model; an existing training method is adopted.
The training method comprises the following steps (an illustrative PyTorch sketch follows this list):
1. Load and process the data: create a DataLoader, use the DataLoader class to split the whole dataset into mini-batches of size batch_size, then iterate over the batches and feed them into the model for training.
2. Define the neural network: define and instantiate the model, and define the loss function used (for example mean squared error or cross-entropy error, with mini-batch learning) and the optimizer (for example SGD (stochastic gradient descent), Momentum, AdaGrad, RMSprop or Adam).
3. Train, validate and test the model (training, validation, test).
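The sketch below only illustrates these three steps with PyTorch; the dataset, network architecture and hyper-parameters are placeholders assumed by the editor and are not the network or training procedure of the application.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# 1. Load and process data: wrap (placeholder) point cloud samples and labels in a DataLoader
points = torch.randn(1024, 3, 2048)            # assumed point cloud samples: (N, channels, points)
labels = torch.randint(0, 10, (1024,))          # assumed scene-structure labels
loader = DataLoader(TensorDataset(points, labels), batch_size=32, shuffle=True)

# 2. Define and instantiate the neural network, loss function and optimizer (placeholder architecture)
model = nn.Sequential(
    nn.Conv1d(3, 64, kernel_size=1), nn.ReLU(),
    nn.AdaptiveMaxPool1d(1), nn.Flatten(),
    nn.Linear(64, 10),
)
criterion = nn.CrossEntropyLoss()                                         # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)    # SGD with momentum

# 3. Training loop; validation and testing follow the same pattern without the backward pass
for epoch in range(5):
    for batch_points, batch_labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_points), batch_labels)
        loss.backward()
        optimizer.step()
```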
Step S4, real-time image recognition
Acquiring point cloud image data to be identified, inputting it into the deep neural network model fully trained in step S3, and obtaining the identification result output by the deep neural network model;
in step S1, the visible light image sample data is converted into point cloud image sample data, and the specific conversion method is as follows:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
Further, in the mapping in step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using the relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
because the spatial range covered by a single imaging module is 80 degrees by 60 degrees, several imaging modules must be installed to achieve full scene coverage; each module has its own position coordinate system, and the modules are mutually independent. Therefore, in this embodiment the imaging sensing system includes at least 3 imaging systems; each imaging system includes at least 1 optical imaging module and 1 laser scanning module, the optical imaging module collects panoramic images of the scene with a double-fisheye-lens single-sensor structure, the laser scanning module is a planar-array laser TOF module, and the imaging systems work in parallel.
According to the requirement of covering the whole scene, several groups of imaging systems are used; each imaging system has a corresponding installation position, and the installed imaging systems together form the imaging sensing system. Besides its own position coordinate system, each imaging system also has a coordinate system a associated with its optical imaging module and a target coordinate system b associated with its laser scanning module.
Step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
the imaging sensing system installed in step S121 is operated, and the positions and attitudes of the optical imaging module and the laser scanning module in each imaging system are adjusted so that the images acquired by the multiple imaging systems can finally form a 3D depth image covering the full scene; the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are then determined from their current imaging conditions.
In determining the relative offset and rotation angle between the optical imaging module and the laser scanning module in each imaging system, the specific method is as follows:
step S122-1, an optical imaging module in each imaging system acquires a corresponding visible light image, and a laser scanning module in each imaging system acquires a corresponding laser scanning point cloud;
and step S122-2, for each imaging system, performing calibration using the visible light image acquired by its optical imaging module and the laser scanning point cloud acquired by its laser scanning module, thereby obtaining the relative offset and the rotation angle between the two modules (an illustrative alignment sketch is given after these steps).
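The application does not state which calibration algorithm is used in step S122-2. Purely as an illustration, the sketch below estimates the rotation and offset between the two modules from corresponding calibration points using the Kabsch (SVD) alignment, a common choice for this kind of rigid calibration; the availability of point correspondences and all names are assumptions.

```python
import numpy as np

def estimate_offset_and_rotation(pts_optical, pts_laser):
    """Estimate rotation R and offset t such that pts_laser ≈ pts_optical @ R.T + t.

    pts_optical: (N, 3) calibration points in the optical imaging module's coordinate system a.
    pts_laser:   (N, 3) the same physical points in the laser scanning module's coordinate system b.
    """
    pts_optical = np.asarray(pts_optical, dtype=float)
    pts_laser = np.asarray(pts_laser, dtype=float)
    mu_a, mu_b = pts_optical.mean(axis=0), pts_laser.mean(axis=0)
    H = (pts_optical - mu_a).T @ (pts_laser - mu_b)     # cross-covariance of the centred point sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))              # guard against an improper (reflected) rotation
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_b - mu_a @ R.T                               # relative offset between the two modules
    return R, t
```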
Step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
The relationship mapping matrix R is generated as follows:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
In addition, the optical imaging module and the laser scanning module are used together for picture calibration of the 3D depth image model: the visible light 2D image formed by the optical imaging module and the 3D image formed by the laser scanning module are fused region by region. Because the two images have different resolutions, that of the visible light 2D image generally being higher, their pixels must be associated; once associated, the surrounding 3D image information can be upsampled to a balanced resolution. In practice the 3D image is usually upsampled by interpolation during matching, which pairs each 3D image pixel with multiple visible light 2D image pixels, so this processing does not affect post-processing.
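As a hedged illustration of the resolution balancing described above (supplied by the editor, not taken from the application), the sketch below bilinearly upsamples a low-resolution TOF depth image to the resolution of the visible light 2D image, so that each depth value is shared by several visible-light pixels; the image sizes and names are placeholders.

```python
import numpy as np

def upsample_depth_to_rgb(depth, rgb_shape):
    """Bilinearly interpolate a low-resolution depth image up to the visible 2D image resolution."""
    h_d, w_d = depth.shape
    h_rgb, w_rgb = rgb_shape
    ys = np.linspace(0, h_d - 1, h_rgb)                 # target row positions in source coordinates
    xs = np.linspace(0, w_d - 1, w_rgb)                 # target column positions in source coordinates
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h_d - 1), np.minimum(x0 + 1, w_d - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    top = depth[np.ix_(y0, x0)] * (1 - wx) + depth[np.ix_(y0, x1)] * wx
    bottom = depth[np.ix_(y1, x0)] * (1 - wx) + depth[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bottom * wy

# Hypothetical usage: a 240x320 TOF depth map matched to a 960x1280 visible light image
depth_hr = upsample_depth_to_rgb(np.random.rand(240, 320), (960, 1280))
```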
The overlapping areas between the 3D depth images formed by the individual imaging systems are calibrated, and the multiple pictures are stitched through corresponding feature points to realize a complete 360-degree panoramic 3D image, whose file contains both image information and the corresponding depth information.
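The feature-point stitching of overlapping areas could, for example, be done with ORB features and a RANSAC homography, as in the OpenCV sketch below; this is an editorial illustration under assumed inputs and does not reproduce the stitching procedure specified by the application.

```python
import cv2
import numpy as np

def stitch_overlapping_views(img_left, img_right):
    """Illustrative stitching of two overlapping views via matched feature points."""
    orb = cv2.ORB_create(nfeatures=2000)
    g1 = cv2.cvtColor(img_left, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(img_right, cv2.COLOR_BGR2GRAY)
    kp1, des1 = orb.detectAndCompute(g1, None)
    kp2, des2 = orb.detectAndCompute(g2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)    # robust transform from left to right view
    h, w = img_right.shape[:2]
    panorama = cv2.warpPerspective(img_left, H, (w * 2, h))  # warp left view into the right view's frame
    panorama[0:h, 0:w] = img_right                           # naive overlay; real blending would be smoother
    return panorama
```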
Example 2
A three-dimensional panoramic image scene structure identification system, comprising the following modules:
the sample data acquisition module is used for acquiring scene video image data through the panoramic monitoring equipment, and converting scene element information in the video image sample data into scene element information in the point cloud image sample data to form tag data.
The deep neural network model building module is used for building the deep neural network model.
Common neural network models include convolutional neural networks (CNN), recurrent neural networks (RNN), deep neural networks (DNN), etc. This embodiment performs feature recognition on point cloud data, so a deep neural network model is built and the built model is used to extract features from the point cloud image data. No particular structure is required for the deep neural network model in this embodiment; an existing deep neural network structure is adopted.
The deep neural network model training module is used for training the deep neural network model built by the deep neural network model building module by adopting the point cloud image sample data and the label data in the sample data acquisition module.
In this embodiment, no particular training method is required for the deep neural network model; an existing training method is adopted.
The training method comprises the following steps:
1. Load and process the data: create a DataLoader, use the DataLoader class to split the whole dataset into mini-batches of size batch_size, then iterate over the batches and feed them into the model for training.
2. Define the neural network: define and instantiate the model, and define the loss function used (for example mean squared error or cross-entropy error, with mini-batch learning) and the optimizer (for example SGD (stochastic gradient descent), Momentum, AdaGrad, RMSprop or Adam).
3. Train, validate and test the model (training, validation, test).
The real-time image recognition module is used for acquiring point cloud image data to be recognized and inputting it into the deep neural network model fully trained by the deep neural network model training module; the deep neural network model outputs the recognition result;
the sample data acquisition module converts visible light image sample data into point cloud image sample data, and the specific conversion method comprises the following steps:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
Further, in the mapping in step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using the relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
because the spatial range covered by a single imaging module is 80 degrees by 60 degrees, several imaging modules must be installed to achieve full scene coverage; each module has its own position coordinate system, and the modules are mutually independent. Therefore, in this embodiment the imaging sensing system includes at least 3 imaging systems; each imaging system includes at least 1 optical imaging module and 1 laser scanning module, the optical imaging module collects panoramic images of the scene with a double-fisheye-lens single-sensor structure, the laser scanning module is a planar-array laser TOF module, and the imaging systems work in parallel.
According to the requirement of covering the whole scene, several groups of imaging systems are used; each imaging system has a corresponding installation position, and the installed imaging systems together form the imaging sensing system. Besides its own position coordinate system, each imaging system also has a coordinate system a associated with its optical imaging module and a target coordinate system b associated with its laser scanning module.
Step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
the imaging sensing system installed in step S121 is operated, and the positions and attitudes of the optical imaging module and the laser scanning module in each imaging system are adjusted so that the images acquired by the multiple imaging systems can finally form a 3D depth image covering the full scene; the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are then determined from their current imaging conditions.
In determining the relative offset and rotation angle between the optical imaging module and the laser scanning module in each imaging system, the specific method is as follows:
step S122-1, an optical imaging module in each imaging system acquires a corresponding visible light image, and a laser scanning module in each imaging system acquires a corresponding laser scanning point cloud;
and step S122-2, for each imaging system, performing calibration using the visible light image acquired by its optical imaging module and the laser scanning point cloud acquired by its laser scanning module, thereby obtaining the relative offset and the rotation angle between the two modules.
Step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
The relationship mapping matrix R is generated as follows:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
In addition, the optical imaging module and the laser scanning module are used together for picture calibration of the 3D depth image model: the visible light 2D image formed by the optical imaging module and the 3D image formed by the laser scanning module are fused region by region. Because the two images have different resolutions, that of the visible light 2D image generally being higher, their pixels must be associated; once associated, the surrounding 3D image information can be upsampled to a balanced resolution. In practice the 3D image is usually upsampled by interpolation during matching, which pairs each 3D image pixel with multiple visible light 2D image pixels, so this processing does not affect post-processing.
The overlapping areas between the 3D depth images formed by the individual imaging systems are calibrated, and the multiple pictures are stitched through corresponding feature points to realize a complete 360-degree panoramic 3D image, whose file contains both image information and the corresponding depth information.
Example 3
The present embodiment provides a computer device including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the three-dimensional panoramic image scene structure identification method described above.
The computer device can be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server, and can interact with a user through a keyboard, a mouse, a remote control, a touch pad, a voice control device, or the like.
The memory includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory may be an internal storage unit of the computer device, such as a hard disk or memory of the computer device. In other embodiments, the memory may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash memory card (Flash Card) or the like. Of course, the memory may also include both an internal storage unit of the computer device and an external storage device. In this embodiment, the memory is typically used to store the operating system and the various application software installed on the computer device, for example the program code of the three-dimensional panoramic image scene structure identification method. In addition, the memory may be used to temporarily store various types of data that have been output or are to be output.
The processor may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor is typically used to control the overall operation of the computer device. In this embodiment, the processor is configured to execute the program code stored in the memory or process data, for example, execute the program code of the three-dimensional panoramic image scene structure identification method.
Example 4
The present embodiment provides a computer-readable storage medium having stored therein a computer program which, when executed by a processor, causes the processor to execute the steps of the three-dimensional panoramic image scene structure identification method described above.
The computer-readable storage medium stores an interface display program executable by at least one processor, so that the at least one processor executes the steps of the three-dimensional panoramic image scene structure identification method described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server or a network device, etc.) to perform the three-dimensional panoramic image scene structure identification method according to the embodiment of the present application.
Claims (6)
1. A three-dimensional panoramic image scene structure identification method, characterized by comprising the following steps:
step S1, obtaining sample data
Acquiring scene video image data through panoramic monitoring equipment, and converting scene element information in video image sample data into scene element information in point cloud image sample data to form tag data;
s2, building a deep neural network model
Building a deep neural network model;
step S3, training a deep neural network model
Training the deep neural network model constructed in the step S2 by adopting the point cloud image sample data and the label data in the step S1;
step S4, real-time image recognition
Acquiring point cloud image data to be identified, inputting it into the deep neural network model fully trained in step S3, and obtaining the identification result output by the deep neural network model;
in step S1, the visible light image sample data is converted into point cloud image sample data, and the specific conversion method is as follows:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
2. The method for recognizing a three-dimensional panoramic image scene structure according to claim 1, wherein in the step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using a relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
3. The method for identifying a three-dimensional panoramic image scene structure according to claim 2, wherein the method for generating the relationship mapping matrix R comprises the following steps:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
4. A three-dimensional panoramic image scene structure identification system, comprising:
the sample data acquisition module is used for acquiring scene video image data through the panoramic monitoring equipment, converting scene element information in the video image sample data into scene element information in the point cloud image sample data, and forming tag data;
the deep neural network model building module is used for building a deep neural network model;
the deep neural network model training module is used for training the deep neural network model built by the deep neural network model building module by adopting the point cloud image sample data and the label data in the sample data acquisition module;
the real-time image recognition module is used for acquiring point cloud image data to be recognized and inputting the point cloud image data into the deep neural network model training module to train a mature deep neural network model, and the deep neural network model outputs a recognition result;
the sample data acquisition module converts visible light image sample data into point cloud image sample data, and the specific conversion method comprises the following steps:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
5. A computer device, characterized by: comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 3.
6. A computer-readable storage medium, characterized by: a computer program is stored which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311022471.8A CN117058520A (en) | 2023-08-14 | 2023-08-14 | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311022471.8A CN117058520A (en) | 2023-08-14 | 2023-08-14 | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117058520A true CN117058520A (en) | 2023-11-14 |
Family
ID=88654724
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311022471.8A Pending CN117058520A (en) | 2023-08-14 | 2023-08-14 | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117058520A (en) |
- 2023-08-14: application CN202311022471.8A filed in China; published as CN117058520A, status pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111563923B (en) | Method for obtaining dense depth map and related device | |
US10924729B2 (en) | Method and device for calibration | |
CN110246163B (en) | Image processing method, image processing device, image processing apparatus, and computer storage medium | |
CN111476827B (en) | Target tracking method, system, electronic device and storage medium | |
US20200111234A1 (en) | Dual-view angle image calibration method and apparatus, storage medium and electronic device | |
CN110568447A (en) | Visual positioning method, device and computer readable medium | |
US11182945B2 (en) | Automatically generating an animatable object from various types of user input | |
JP7657308B2 | Method, apparatus and system for generating a three-dimensional model of a scene | |
CN110260857A (en) | Calibration method, device and the storage medium of vision map | |
Luo et al. | A review of homography estimation: advances and challenges | |
CN114022560A (en) | Calibration method and related device and equipment | |
Shi et al. | An improved lightweight deep neural network with knowledge distillation for local feature extraction and visual localization using images and LiDAR point clouds | |
CN114863201B (en) | Training method, device, computer equipment and storage medium for three-dimensional detection model | |
McIlroy et al. | Kinectrack: 3d pose estimation using a projected dense dot pattern | |
CN113643328B (en) | Calibration object reconstruction method and device, electronic equipment and computer readable medium | |
Uma et al. | Marker based augmented reality food menu | |
JP2016038790A (en) | Image processing apparatus and image feature detection method, program and apparatus thereof | |
CN112016495A (en) | Face recognition method and device and electronic equipment | |
GB2557212A (en) | Methods and apparatuses for determining positions of multi-directional image capture apparatuses | |
CN117058520A (en) | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium | |
CN117332370A (en) | Underwater target acousto-optic panorama cooperative identification device and identification method | |
Gao et al. | Mc-nerf: Multi-camera neural radiance fields for multi-camera image acquisition systems | |
EP4350615A1 (en) | Facial deformation compensation method for facial depth image, and imaging apparatus and storage medium | |
CN117152244A (en) | Inter-screen relationship determination method and device, electronic equipment and storage medium | |
CN116819489A (en) | Dynamic object detection method, model training method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |