CN117058520A - Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium - Google Patents
Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium
- Publication number
- CN117058520A CN117058520A CN202311022471.8A CN202311022471A CN117058520A CN 117058520 A CN117058520 A CN 117058520A CN 202311022471 A CN202311022471 A CN 202311022471A CN 117058520 A CN117058520 A CN 117058520A
- Authority
- CN
- China
- Prior art keywords
- coordinate system
- neural network
- sample data
- network model
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 46
- 238000003062 neural network model Methods 0.000 claims abstract description 68
- 238000012549 training Methods 0.000 claims abstract description 44
- 238000006243 chemical reaction Methods 0.000 claims abstract description 8
- 238000003384 imaging method Methods 0.000 claims description 87
- 238000012634 optical imaging Methods 0.000 claims description 41
- 238000013507 mapping Methods 0.000 claims description 28
- 239000011159 matrix material Substances 0.000 claims description 19
- 238000004590 computer program Methods 0.000 claims description 6
- 238000012544 monitoring process Methods 0.000 claims description 6
- 230000000875 corresponding effect Effects 0.000 description 12
- 238000012545 processing Methods 0.000 description 9
- 230000003321 amplification Effects 0.000 description 6
- 238000013528 artificial neural network Methods 0.000 description 6
- 238000013527 convolutional neural network Methods 0.000 description 4
- 230000002596 correlated effect Effects 0.000 description 4
- 230000010287 polarization Effects 0.000 description 4
- 238000012360 testing method Methods 0.000 description 4
- 238000010200 validation analysis Methods 0.000 description 4
- 238000013473 artificial intelligence Methods 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 238000011478 gradient descent method Methods 0.000 description 2
- 238000009434 installation Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000012805 post-processing Methods 0.000 description 2
- 230000036544 posture Effects 0.000 description 2
- 230000000306 recurrent effect Effects 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B26/00—Optical devices or arrangements for the control of light using movable or deformable optical elements
- G02B26/08—Optical devices or arrangements for the control of light using movable or deformable optical elements for controlling the direction of light
- G02B26/10—Scanning systems
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01B—MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
- G01B11/00—Measuring arrangements characterised by the use of optical techniques
- G01B11/24—Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Databases & Information Systems (AREA)
- Optics & Photonics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Image Processing (AREA)
Abstract
The application discloses a three-dimensional panoramic image scene structure identification method, system, equipment and storage medium in the technical field of image recognition, aiming to solve the prior-art problems that training sample images are difficult to acquire and neural network models train inefficiently. The method collects sample data, builds and trains a deep neural network model, and converts visible light images into point cloud images by a specific coordinate conversion method, so that a large number of point cloud images can be obtained quickly and conveniently. Using these point cloud images as sample data when training the neural network model lowers the difficulty of acquiring training sample images and raises the training efficiency of the neural network model.
Description
Technical Field
The application belongs to the technical field of image recognition, relates to recognition of three-dimensional panoramic images, and in particular relates to a scene structure recognition method, system, equipment and storage medium based on a three-dimensional panoramic image.
Background
In recent years, with the development of information technology, artificial intelligence and machine learning, image recognition technology has advanced steadily, and its applications in various fields have gradually matured.
When artificial intelligence is used for image recognition, labeled sample data must first be used to train a neural network model, and the trained model is then used for recognition. In the prior art, neural network models are mostly used to recognize and classify images such as medical images and face images, and are rarely applied to three-dimensional panoramic images. In the prior art, a three-dimensional panoramic image is generated by stitching one or more groups of photos, taken with a camera rotated through 360 degrees, into a panoramic image; after stitching, a series of mathematical calculations yields an equirectangular projection or cube map of the spherical panorama, and computer technology then restores and displays the real scene for all-round interactive viewing.
The patent application with application number 201410675115.0 discloses an all-weather active panoramic sensing device and a 3D panoramic modeling method, comprising a moving-body laser light source, a multi-output omnidirectional vision sensor and a microprocessor. The moving-body laser light source generates a three-dimensional structured projection light source. A hyperboloid mirror and two camera units are arranged in the multi-output omnidirectional vision sensor; a polarization beam-splitter prism placed on the catadioptric light path of the hyperboloid mirror divides the catadioptric light into light containing a given polarization component and light not containing that component. The two camera units, located on the reflection and transmission light paths of the polarization beam-splitter prism respectively, acquire a first panoramic video image containing only polarized light information and a second panoramic video image containing light intensity information. The microprocessor fuses the point cloud geometric information in the first panoramic video image with the color information in the second panoramic video image to construct a panoramic 3D model.
If panoramic images stitched from one or more groups of photos taken with a camera rotated through 360 degrees are used as training samples of the neural network model, the model requires a large amount of sample data during training, so the number of photos to be taken is very large and the computation needed to generate the sample data is very large. If the all-weather active panoramic sensing device and 3D panoramic modeling method above are used to generate training samples, many point cloud images and visible light images must likewise be captured, fused and corrected. In either case, training sample images are difficult to acquire and the neural network model trains inefficiently.
Disclosure of Invention
The aim of the application is to solve the prior-art problems that training sample images are difficult to acquire and neural network models train inefficiently; to this end, the application provides a three-dimensional panoramic image scene structure identification method, system, equipment and storage medium.
The application adopts the following technical solution to achieve this aim:
a three-dimensional panoramic image scene structure identification method comprises the following steps:
step S1, obtaining sample data
Acquiring scene video image data through panoramic monitoring equipment, and converting scene element information in video image sample data into scene element information in point cloud image sample data to form tag data;
s2, building a deep neural network model
Building a deep neural network model;
step S3, training a deep neural network model
Training the deep neural network model constructed in the step S2 by adopting the point cloud image sample data and the label data in the step S1;
step S4, real-time image recognition
Acquiring point cloud image data to be identified, inputting it into the deep neural network model fully trained in step S3, and obtaining the identification result output by the deep neural network model;
in step S1, the visible light image sample data is converted into point cloud image sample data, and the specific conversion method is as follows:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
Further, in the mapping in step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using the relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
Further, the relationship mapping matrix R is generated as follows:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
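As an editorial illustration only (not part of the original disclosure), the following Python sketch shows how a relationship mapping matrix R of the form above, together with an assumed relative offset t, could map scene-element coordinates recognised in the visible light image (coordinate system a) into the laser scanning module's target coordinate system b to form point cloud sample data; all function names, variable names and numeric values are hypothetical.

```python
import numpy as np

def rotation_matrix(alpha, beta, theta):
    """Compose R = R_x(alpha) @ R_y(beta) @ R_z(theta) from the three rotation angles (radians)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    ct, st = np.cos(theta), np.sin(theta)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[ct, -st, 0], [st, ct, 0], [0, 0, 1]])
    return Rx @ Ry @ Rz

def map_to_point_cloud(coords_a, R, t=np.zeros(3)):
    """Map an (N, 3) set of scene-element coordinates from coordinate system a (step S11)
    into the target coordinate system b (step S12); the rows form the point cloud sample (step S13)."""
    coords_a = np.asarray(coords_a, dtype=float)
    return coords_a @ R.T + t

# Hypothetical usage with made-up angles and coordinates:
R = rotation_matrix(np.deg2rad(2.0), np.deg2rad(-1.5), np.deg2rad(0.5))
point_cloud_sample = map_to_point_cloud([[0.5, 1.2, 3.0], [0.7, 1.1, 2.8], [0.4, 1.3, 3.1]], R)
```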
A three-dimensional panoramic image scene structure identification system, comprising:
the sample data acquisition module is used for acquiring scene video image data through the panoramic monitoring equipment, converting scene element information in the video image sample data into scene element information in the point cloud image sample data, and forming tag data;
the deep neural network model building module is used for building a deep neural network model;
the deep neural network model training module is used for training the deep neural network model built by the deep neural network model building module by adopting the point cloud image sample data and the label data in the sample data acquisition module;
the real-time image recognition module is used for acquiring point cloud image data to be recognized and inputting the point cloud image data into the deep neural network model training module to train a mature deep neural network model, and the deep neural network model outputs a recognition result;
the sample data acquisition module converts visible light image sample data into point cloud image sample data, and the specific conversion method comprises the following steps:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method described above.
A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method described above.
The beneficial effects of the application are as follows:
in the application, visible light images are easier to obtain than point cloud images; a specific coordinate conversion method is then used to convert the visible light images into point cloud images, so a large number of point cloud images can be obtained quickly and conveniently. Using these point cloud images as sample data for training the neural network model lowers the difficulty of acquiring training sample images and raises the training efficiency of the neural network model.
In the application, a relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated, and coordinate data are converted between the two different coordinate systems in real time according to this matrix. Because the matrix R for real-time conversion between the two coordinate systems is established in advance, the amount of computation during three-dimensional panoramic imaging is greatly reduced, the panoramic imaging efficiency is markedly improved, and the imaging precision is also greatly improved.
Drawings
FIG. 1 is a schematic flow chart of the method of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Example 1
A three-dimensional panoramic image scene structure identification method comprises the following steps:
step S1, obtaining sample data
Acquiring scene video image data through panoramic monitoring equipment, and converting scene element information in video image sample data into scene element information in point cloud image sample data to form tag data;
s2, building a deep neural network model
Common neural network models include convolutional neural networks (CNN), recurrent neural networks (RNN), deep neural networks (DNN), etc. This embodiment performs feature recognition on point cloud data, so a deep neural network model is built and the built model is used to extract features from the point cloud image data. No particular structure is required for the deep neural network model in this embodiment; an existing deep neural network structure is adopted.
Step S3, training a deep neural network model
And training the deep neural network model constructed in the step S2 by adopting the point cloud image sample data and the label data in the step S1.
In this embodiment, no particular training method is required for the deep neural network model; an existing training method is adopted.
The training method comprises the following steps (an illustrative PyTorch sketch follows this list):
1. Load and process the data: create a DataLoader, use the DataLoader class to split the whole dataset into mini-batches of size batch_size, then iterate over the batches and feed them into the model for training.
2. Define the neural network: define and instantiate the model, and define the loss function used (for example mean squared error or cross-entropy error, with mini-batch learning) and the optimizer (for example SGD (stochastic gradient descent), Momentum, AdaGrad, RMSprop or Adam).
3. Train, validate and test the model (training, validation, test).
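The sketch below only illustrates these three steps with PyTorch; the dataset, network architecture and hyper-parameters are placeholders assumed by the editor and are not the network or training procedure of the application.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# 1. Load and process data: wrap (placeholder) point cloud samples and labels in a DataLoader
points = torch.randn(1024, 3, 2048)            # assumed point cloud samples: (N, channels, points)
labels = torch.randint(0, 10, (1024,))          # assumed scene-structure labels
loader = DataLoader(TensorDataset(points, labels), batch_size=32, shuffle=True)

# 2. Define and instantiate the neural network, loss function and optimizer (placeholder architecture)
model = nn.Sequential(
    nn.Conv1d(3, 64, kernel_size=1), nn.ReLU(),
    nn.AdaptiveMaxPool1d(1), nn.Flatten(),
    nn.Linear(64, 10),
)
criterion = nn.CrossEntropyLoss()                                         # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)    # SGD with momentum

# 3. Training loop; validation and testing follow the same pattern without the backward pass
for epoch in range(5):
    for batch_points, batch_labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_points), batch_labels)
        loss.backward()
        optimizer.step()
```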
Step S4, real-time image recognition
Acquiring point cloud image data to be identified, inputting it into the deep neural network model fully trained in step S3, and obtaining the identification result output by the deep neural network model;
in step S1, the visible light image sample data is converted into point cloud image sample data, and the specific conversion method is as follows:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
Further, in the mapping in step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using the relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
because the spatial range covered by a single imaging module is 80 degrees by 60 degrees, several imaging modules must be installed to achieve full scene coverage; each module has its own position coordinate system, and the modules are mutually independent. Therefore, in this embodiment the imaging sensing system includes at least 3 imaging systems; each imaging system includes at least 1 optical imaging module and 1 laser scanning module, the optical imaging module collects panoramic images of the scene with a double-fisheye-lens single-sensor structure, the laser scanning module is a planar-array laser TOF module, and the imaging systems work in parallel.
According to the requirement of covering the whole scene, several groups of imaging systems are used; each imaging system has a corresponding installation position, and the installed imaging systems together form the imaging sensing system. Besides its own position coordinate system, each imaging system also has a coordinate system a associated with its optical imaging module and a target coordinate system b associated with its laser scanning module.
Step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
the imaging sensing system installed in step S121 is operated, and the positions and attitudes of the optical imaging module and the laser scanning module in each imaging system are adjusted so that the images acquired by the multiple imaging systems can finally form a 3D depth image covering the full scene; the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are then determined from their current imaging conditions.
In determining the relative offset and rotation angle between the optical imaging module and the laser scanning module in each imaging system, the specific method is as follows:
step S122-1, an optical imaging module in each imaging system acquires a corresponding visible light image, and a laser scanning module in each imaging system acquires a corresponding laser scanning point cloud;
and step S122-2, for each imaging system, performing calibration using the visible light image acquired by its optical imaging module and the laser scanning point cloud acquired by its laser scanning module, thereby obtaining the relative offset and the rotation angle between the two modules (an illustrative alignment sketch is given after these steps).
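The application does not state which calibration algorithm is used in step S122-2. Purely as an illustration, the sketch below estimates the rotation and offset between the two modules from corresponding calibration points using the Kabsch (SVD) alignment, a common choice for this kind of rigid calibration; the availability of point correspondences and all names are assumptions.

```python
import numpy as np

def estimate_offset_and_rotation(pts_optical, pts_laser):
    """Estimate rotation R and offset t such that pts_laser ≈ pts_optical @ R.T + t.

    pts_optical: (N, 3) calibration points in the optical imaging module's coordinate system a.
    pts_laser:   (N, 3) the same physical points in the laser scanning module's coordinate system b.
    """
    pts_optical = np.asarray(pts_optical, dtype=float)
    pts_laser = np.asarray(pts_laser, dtype=float)
    mu_a, mu_b = pts_optical.mean(axis=0), pts_laser.mean(axis=0)
    H = (pts_optical - mu_a).T @ (pts_laser - mu_b)     # cross-covariance of the centred point sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))              # guard against an improper (reflected) rotation
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_b - mu_a @ R.T                               # relative offset between the two modules
    return R, t
```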
Step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
The relationship mapping matrix R is generated as follows:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
In addition, the optical imaging module and the laser scanning module are used together for picture calibration of the 3D depth image model: the visible light 2D image formed by the optical imaging module and the 3D image formed by the laser scanning module are fused region by region. Because the two images have different resolutions, that of the visible light 2D image generally being higher, their pixels must be associated; once associated, the surrounding 3D image information can be upsampled to a balanced resolution. In practice the 3D image is usually upsampled by interpolation during matching, which pairs each 3D image pixel with multiple visible light 2D image pixels, so this processing does not affect post-processing.
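As a hedged illustration of the resolution balancing described above (supplied by the editor, not taken from the application), the sketch below bilinearly upsamples a low-resolution TOF depth image to the resolution of the visible light 2D image, so that each depth value is shared by several visible-light pixels; the image sizes and names are placeholders.

```python
import numpy as np

def upsample_depth_to_rgb(depth, rgb_shape):
    """Bilinearly interpolate a low-resolution depth image up to the visible 2D image resolution."""
    h_d, w_d = depth.shape
    h_rgb, w_rgb = rgb_shape
    ys = np.linspace(0, h_d - 1, h_rgb)                 # target row positions in source coordinates
    xs = np.linspace(0, w_d - 1, w_rgb)                 # target column positions in source coordinates
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h_d - 1), np.minimum(x0 + 1, w_d - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    top = depth[np.ix_(y0, x0)] * (1 - wx) + depth[np.ix_(y0, x1)] * wx
    bottom = depth[np.ix_(y1, x0)] * (1 - wx) + depth[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bottom * wy

# Hypothetical usage: a 240x320 TOF depth map matched to a 960x1280 visible light image
depth_hr = upsample_depth_to_rgb(np.random.rand(240, 320), (960, 1280))
```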
The overlapping areas between the 3D depth images formed by the individual imaging systems are calibrated, and the multiple pictures are stitched through corresponding feature points to realize a complete 360-degree panoramic 3D image, whose file contains both image information and the corresponding depth information.
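The feature-point stitching of overlapping areas could, for example, be done with ORB features and a RANSAC homography, as in the OpenCV sketch below; this is an editorial illustration under assumed inputs and does not reproduce the stitching procedure specified by the application.

```python
import cv2
import numpy as np

def stitch_overlapping_views(img_left, img_right):
    """Illustrative stitching of two overlapping views via matched feature points."""
    orb = cv2.ORB_create(nfeatures=2000)
    g1 = cv2.cvtColor(img_left, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(img_right, cv2.COLOR_BGR2GRAY)
    kp1, des1 = orb.detectAndCompute(g1, None)
    kp2, des2 = orb.detectAndCompute(g2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)    # robust transform from left to right view
    h, w = img_right.shape[:2]
    panorama = cv2.warpPerspective(img_left, H, (w * 2, h))  # warp left view into the right view's frame
    panorama[0:h, 0:w] = img_right                           # naive overlay; real blending would be smoother
    return panorama
```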
Example 2
A three-dimensional panoramic image scene structure identification system, comprising the following modules:
the sample data acquisition module is used for acquiring scene video image data through the panoramic monitoring equipment, and converting scene element information in the video image sample data into scene element information in the point cloud image sample data to form tag data.
The deep neural network model building module is used for building the deep neural network model.
Common neural network models include convolutional neural networks (CNN), recurrent neural networks (RNN), deep neural networks (DNN), etc. This embodiment performs feature recognition on point cloud data, so a deep neural network model is built and the built model is used to extract features from the point cloud image data. No particular structure is required for the deep neural network model in this embodiment; an existing deep neural network structure is adopted.
The deep neural network model training module is used for training the deep neural network model built by the deep neural network model building module by adopting the point cloud image sample data and the label data in the sample data acquisition module.
In this embodiment, no particular training method is required for the deep neural network model; an existing training method is adopted.
The training method comprises the following steps:
1. Load and process the data: create a DataLoader, use the DataLoader class to split the whole dataset into mini-batches of size batch_size, then iterate over the batches and feed them into the model for training.
2. Define the neural network: define and instantiate the model, and define the loss function used (for example mean squared error or cross-entropy error, with mini-batch learning) and the optimizer (for example SGD (stochastic gradient descent), Momentum, AdaGrad, RMSprop or Adam).
3. Train, validate and test the model (training, validation, test).
The real-time image recognition module is used for acquiring point cloud image data to be recognized and inputting it into the deep neural network model fully trained by the deep neural network model training module; the deep neural network model outputs the recognition result;
the sample data acquisition module converts visible light image sample data into point cloud image sample data, and the specific conversion method comprises the following steps:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
Further, in the mapping in step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using the relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
because the spatial range covered by a single imaging module is 80 degrees by 60 degrees, several imaging modules must be installed to achieve full scene coverage; each module has its own position coordinate system, and the modules are mutually independent. Therefore, in this embodiment the imaging sensing system includes at least 3 imaging systems; each imaging system includes at least 1 optical imaging module and 1 laser scanning module, the optical imaging module collects panoramic images of the scene with a double-fisheye-lens single-sensor structure, the laser scanning module is a planar-array laser TOF module, and the imaging systems work in parallel.
According to the requirement of covering the whole scene, several groups of imaging systems are used; each imaging system has a corresponding installation position, and the installed imaging systems together form the imaging sensing system. Besides its own position coordinate system, each imaging system also has a coordinate system a associated with its optical imaging module and a target coordinate system b associated with its laser scanning module.
Step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
the imaging sensing system installed in step S121 is operated, and the positions and attitudes of the optical imaging module and the laser scanning module in each imaging system are adjusted so that the images acquired by the multiple imaging systems can finally form a 3D depth image covering the full scene; the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are then determined from their current imaging conditions.
In determining the relative offset and rotation angle between the optical imaging module and the laser scanning module in each imaging system, the specific method is as follows:
step S122-1, an optical imaging module in each imaging system acquires a corresponding visible light image, and a laser scanning module in each imaging system acquires a corresponding laser scanning point cloud;
and step S122-2, for each imaging system, performing calibration using the visible light image acquired by its optical imaging module and the laser scanning point cloud acquired by its laser scanning module, thereby obtaining the relative offset and the rotation angle between the two modules.
Step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
The relationship mapping matrix R is generated as follows:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
In addition, the optical imaging module and the laser scanning module are used together for picture calibration of the 3D depth image model: the visible light 2D image formed by the optical imaging module and the 3D image formed by the laser scanning module are fused region by region. Because the two images have different resolutions, that of the visible light 2D image generally being higher, their pixels must be associated; once associated, the surrounding 3D image information can be upsampled to a balanced resolution. In practice the 3D image is usually upsampled by interpolation during matching, which pairs each 3D image pixel with multiple visible light 2D image pixels, so this processing does not affect post-processing.
The overlapping areas between the 3D depth images formed by the individual imaging systems are calibrated, and the multiple pictures are stitched through corresponding feature points to realize a complete 360-degree panoramic 3D image, whose file contains both image information and the corresponding depth information.
Example 3
The present embodiment provides a computer device including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the three-dimensional panoramic image scene structure identification method described above.
The computer device can be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server, and can interact with a user through a keyboard, a mouse, a remote control, a touch pad, a voice control device, or the like.
The memory includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory may be an internal storage unit of the computer device, such as a hard disk or memory of the computer device. In other embodiments, the memory may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash memory card (Flash Card) or the like. Of course, the memory may also include both an internal storage unit of the computer device and an external storage device. In this embodiment, the memory is typically used to store the operating system and the various application software installed on the computer device, for example the program code of the three-dimensional panoramic image scene structure identification method. In addition, the memory may be used to temporarily store various types of data that have been output or are to be output.
The processor may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor is typically used to control the overall operation of the computer device. In this embodiment, the processor is configured to execute the program code stored in the memory or process data, for example, execute the program code of the three-dimensional panoramic image scene structure identification method.
Example 4
The present embodiment provides a computer-readable storage medium having stored therein a computer program which, when executed by a processor, causes the processor to execute the steps of the three-dimensional panoramic image scene structure identification method described above.
The computer-readable storage medium stores an interface display program executable by at least one processor, so that the at least one processor executes the steps of the three-dimensional panoramic image scene structure identification method described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server or a network device, etc.) to perform the three-dimensional panoramic image scene structure identification method according to the embodiment of the present application.
Claims (6)
1. A three-dimensional panoramic image scene structure identification method, characterized by comprising the following steps:
step S1, obtaining sample data
Acquiring scene video image data through panoramic monitoring equipment, and converting scene element information in video image sample data into scene element information in point cloud image sample data to form tag data;
s2, building a deep neural network model
Building a deep neural network model;
step S3, training a deep neural network model
Training the deep neural network model constructed in the step S2 by adopting the point cloud image sample data and the label data in the step S1;
step S4, real-time image recognition
Acquiring point cloud image data to be identified, inputting it into the deep neural network model fully trained in step S3, and obtaining the identification result output by the deep neural network model;
in step S1, the visible light image sample data is converted into point cloud image sample data, and the specific conversion method is as follows:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
2. The method for recognizing a three-dimensional panoramic image scene structure according to claim 1, wherein in the step S12, the coordinates in the coordinate system a are mapped into the target coordinate system b by using a relationship mapping matrix R, and the relationship mapping matrix R is obtained by:
step S121: according to the requirement of covering the whole scene, adopting a plurality of groups of imaging systems to install an imaging sensing system; each imaging system comprises an optical imaging module and a laser scanning module;
step S122: the image acquired by the imaging sensing system installed in the step S121 is adopted to form a 3D depth image, and the relative offset and the rotation angle between the optical imaging module and the laser scanning module in each imaging system are determined;
step S123: the relationship mapping matrix R between the optical imaging module and the laser scanning module in each imaging system is generated according to the relative offset and the rotation angle determined in step S122.
3. The method for identifying a three-dimensional panoramic image scene structure according to claim 2, wherein the method for generating the relationship mapping matrix R comprises the following steps:
R = R_x(α) R_y(β) R_z(θ);
wherein α is the rotation angle of the target coordinate system b relative to the coordinate system a about the X axis, β is the rotation angle about the Y axis, and θ is the rotation angle about the Z axis; the coordinate system a is the three-dimensional coordinate system in which the optical imaging module acquires data, and the target coordinate system b is the three-dimensional coordinate system in which the laser scanning module acquires data.
4. A three-dimensional panoramic image scene structure identification system, comprising:
the sample data acquisition module is used for acquiring scene video image data through the panoramic monitoring equipment, converting scene element information in the video image sample data into scene element information in the point cloud image sample data, and forming tag data;
the deep neural network model building module is used for building a deep neural network model;
the deep neural network model training module is used for training the deep neural network model built by the deep neural network model building module by adopting the point cloud image sample data and the label data in the sample data acquisition module;
the real-time image recognition module is used for acquiring point cloud image data to be recognized and inputting the point cloud image data into the deep neural network model training module to train a mature deep neural network model, and the deep neural network model outputs a recognition result;
the sample data acquisition module converts visible light image sample data into point cloud image sample data, and the specific conversion method comprises the following steps:
step S11, obtaining and identifying visible light image sample data to obtain coordinates of scene elements in the visible light image sample data, and collecting the coordinates of all the scene elements to obtain a coordinate set in a coordinate system a;
step S12, mapping each coordinate in the coordinate set in the coordinate system a in the step S11 into the coordinate system of the laser scanning module one by one to obtain a coordinate set in the target coordinate system b;
and S13, composing the coordinate set in the target coordinate system b in the step S12 into point cloud image sample data.
5. A computer device, characterized by: comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 3.
6. A computer-readable storage medium, characterized by: a computer program is stored which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311022471.8A CN117058520A (en) | 2023-08-14 | 2023-08-14 | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311022471.8A CN117058520A (en) | 2023-08-14 | 2023-08-14 | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117058520A true CN117058520A (en) | 2023-11-14 |
Family
ID=88654724
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311022471.8A Pending CN117058520A (en) | 2023-08-14 | 2023-08-14 | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117058520A (en) |
- 2023-08-14: application CN202311022471.8A filed in China; published as CN117058520A, status pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111563923B (en) | Method for obtaining dense depth map and related device | |
US10924729B2 (en) | Method and device for calibration | |
CN110246163B (en) | Image processing method, image processing device, image processing apparatus, and computer storage medium | |
CN111476827B (en) | Target tracking method, system, electronic device and storage medium | |
US20200111234A1 (en) | Dual-view angle image calibration method and apparatus, storage medium and electronic device | |
CN110568447A (en) | Visual positioning method, device and computer readable medium | |
US11182945B2 (en) | Automatically generating an animatable object from various types of user input | |
JP7657308B2 | Method, apparatus and system for generating a three-dimensional model of a scene | |
CN110260857A (en) | Calibration method, device and the storage medium of vision map | |
Luo et al. | A review of homography estimation: advances and challenges | |
CN114022560A (en) | Calibration method and related device and equipment | |
Shi et al. | An improved lightweight deep neural network with knowledge distillation for local feature extraction and visual localization using images and LiDAR point clouds | |
CN114863201B (en) | Training method, device, computer equipment and storage medium for three-dimensional detection model | |
McIlroy et al. | Kinectrack: 3d pose estimation using a projected dense dot pattern | |
CN113643328B (en) | Calibration object reconstruction method and device, electronic equipment and computer readable medium | |
Uma et al. | Marker based augmented reality food menu | |
JP2016038790A (en) | Image processing apparatus and image feature detection method, program and apparatus thereof | |
CN112016495A (en) | Face recognition method and device and electronic equipment | |
GB2557212A (en) | Methods and apparatuses for determining positions of multi-directional image capture apparatuses | |
CN117058520A (en) | Three-dimensional panoramic image scene structure identification method, system, equipment and storage medium | |
CN117332370A (en) | Underwater target acousto-optic panorama cooperative identification device and identification method | |
Gao et al. | Mc-nerf: Multi-camera neural radiance fields for multi-camera image acquisition systems | |
EP4350615A1 (en) | Facial deformation compensation method for facial depth image, and imaging apparatus and storage medium | |
CN117152244A (en) | Inter-screen relationship determination method and device, electronic equipment and storage medium | |
CN116819489A (en) | Dynamic object detection method, model training method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |