Disclosure of Invention
To address the problem in the prior art that three-dimensional reconstruction of the stockpiles and vehicles in a large-field-of-view metallurgical storage area relies on fixed-position detection, so that the acquired detection information is extremely limited, the invention provides a depth vision-based global three-dimensional reconstruction method and device for the metallurgical storage area.
In order to solve the technical problems, the invention provides the following technical scheme:
In one aspect, a depth vision-based metallurgical reservoir global three-dimensional reconstruction method is provided, comprising:
S1, acquiring original images of continuous frames and the depth images corresponding to the original images through a depth camera to form an original image sequence and a depth image sequence;
S2, matching ORB feature points between two adjacent frames of original images to obtain initial matching point pairs, and screening the initial matching point pairs to obtain feature point pair matching information and feature point pair pixel coordinates;
S3, extracting the depth information of the depth image corresponding to the original image through the depth image sequence, and combining the matched feature point pair pixel coordinates with the corresponding depth information to obtain local spatial three-dimensional information;
S4, calculating a frame pose transformation matrix of the original image;
S5, repeatedly executing S1-S4 until all frame pose transformation matrices in the original image sequence are calculated, and splicing the local spatial three-dimensional information acquired at different positions by combining the pose transformation matrices to form global spatial three-dimensional information;
S6, converting the coordinates of the global spatial three-dimensional information into the coordinate system of the crown block in the reservoir area through coordinate-system transformation, and providing them to the crown block.
Optionally, in step S1, acquiring original images of continuous frames and the depth images corresponding to the original images through a depth camera to form an original image sequence and a depth image sequence, and respectively detecting ORB feature points in two adjacent frames of original images, comprises:
S11, acquiring original images and depth images of continuous frames through a depth camera which is arranged on the crown block and moves along with the crown block;
S12, selecting the current original image and the previous frame original image to extract ORB feature points, and obtaining the pixel coordinates of the ORB feature points;
S13, taking each ORB feature point as the center, selecting a window 13 pixels in size, selecting 128 groups of comparison point pairs in the window, comparing the gray value of the center pixel with the gray values of the pixels in the window, and calculating the binary code string of the ORB feature point descriptor through formula (1):
wherein I(p) represents the gray value of the center pixel, I(q) represents the gray value of a pixel in the window, and N_p represents the set of pixels contained in the neighborhood of the center pixel within the set window;
the corresponding bit of the binary string is set to 0 if I(q) > I(p) and to 1 if I(q) < I(p).
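The body of formula (1) is not reproduced in the text above. A plausible reconstruction, consistent with the definitions of I(p), I(q) and N_p and with the 0/1 assignment just described, is the per-pair bit test (the symbol τ is introduced here only for illustration):

```latex
\tau(p,q)=
\begin{cases}
0, & I(q) > I(p)\\
1, & I(q) < I(p)
\end{cases},
\qquad q \in N_p
```

with the 128 comparison results concatenated into the binary code string of the descriptor.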
Optionally, in step S2, ORB feature points between two adjacent frames of original images are matched to obtain an initial matching point pair, the initial matching point pair is screened to obtain feature point pair matching information and feature point pair pixel coordinates, and the method comprises the following steps:
S21, matching the ORB feature points detected in the current original image and the previous frame original image to form matching point pairs;
S22, substituting the initial matching point pairs obtained in S21 into formula (2) through the random sample consensus (RANSAC) algorithm:
wherein (x', y') represents the position of an initial matching point in the current original image, and (x, y) represents the position of the corresponding initial matching point in the previous frame original image;
determining the optimal transformation matrix that satisfies the largest number of matching point pairs, and eliminating outliers (a possible form of formula (2) is sketched after step S23);
S23, collecting position coordinate information of the crown block at different times and in different images, calculating the movement direction of the crown block, eliminating feature point pairs whose relative positions between the current original image and the previous frame original image do not accord with the movement trend of the crown block, and obtaining the final feature point pair matching information and feature point pair pixel coordinates.
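Formula (2) itself does not appear in the text. Given the point positions (x', y'), (x, y) and the scale parameter s mentioned in the detailed description, one plausible form is a homogeneous transformation model fitted by RANSAC, where the entries h_ij of the 3×3 matrix are notation introduced here only for illustration:

```latex
s\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
h_{11} & h_{12} & h_{13}\\
h_{21} & h_{22} & h_{23}\\
h_{31} & h_{32} & h_{33}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
```

RANSAC then repeatedly estimates this matrix from random minimal subsets of the initial matching point pairs and keeps the estimate supported by the largest number of inliers.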
Optionally, in step S3, depth information of a depth image corresponding to the original image is extracted through a depth image sequence, and the matching feature point pair pixel coordinates and the corresponding depth information are combined to obtain local space three-dimensional information, which comprises the following steps:
S31, extracting the depth value D from the depth image corresponding to the original image through the depth image sequence;
S32, substituting the feature point pair pixel coordinates obtained in step S2 and the depth value D into formula (3), and calculating the local spatial three-dimensional coordinates of the feature points to obtain the local spatial three-dimensional information:
wherein (u, v) represents the pixel coordinates of a feature point; C_x and C_y are camera calibration parameters representing the position coordinates of the image principal point; f_x and f_y are camera calibration parameters, f_x representing the component of the focal length along the X-axis direction in the image pixel coordinate system and f_y representing the component of the focal length along the Y-axis direction in the image pixel coordinate system; and X, Y, D represent the calculated three-dimensional coordinates.
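Formula (3) is not reproduced above. Based on the variable definitions, a plausible reconstruction is the standard pinhole back-projection with the measured depth D:

```latex
X=\frac{(u-C_x)\,D}{f_x},\qquad
Y=\frac{(v-C_y)\,D}{f_y},\qquad
Z=D
```

so that (X, Y, D) are the local spatial three-dimensional coordinates of the feature point.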
Optionally, in step S4, calculating a frame pose transformation matrix of the original image includes:
S41, constructing a spatial point set error term between the current original image and the previous frame original image, and calculating the spatial transformation matrix with the minimum spatial position error through singular value decomposition (SVD) of formula (4), a possible explicit form of which is sketched after step S42,
wherein p_i represents the spatial three-dimensional point set obtained after feature point matching of the current original image, q_i represents the spatial three-dimensional point set obtained after feature point matching of the previous frame original image, and E_i represents the spatial position error set of the feature point pairs corresponding to the spatial transformation matrix [R|t];
S42, converting the spatial transformation matrix into quaternion format for storage to obtain the frame pose transformation matrix.
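As referenced in step S41, formula (4) is not reproduced above. A plausible explicit form, consistent with the definitions of p_i, q_i, E_i and [R|t], is the point-to-point error term and the least-squares objective solved by SVD:

```latex
E_i = p_i - \left(R\,q_i + t\right),
\qquad
[R\,|\,t] = \arg\min_{R,\,t}\ \sum_i \left\| p_i - \left(R\,q_i + t\right) \right\|^2
```

The direction of the mapping (previous frame onto current frame, or the reverse) is an assumption and would follow the convention used when composing the frame pose transformation matrices.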
Optionally, in step S5, the stitching is performed on the local spatial three-dimensional information acquired at different positions in combination with the pose transformation matrix to form global spatial three-dimensional information, including:
Combining the local space three-dimensional information under the images at different moments with a frame pose transformation matrix, and splicing the space three-dimensional information by using the space transformation matrix through a formula (5) to form global space three-dimensional information:
P_i = T · P'_i   (5)
wherein P_i represents the spatial three-dimensional point set in the current original image, P'_i represents the spatial three-dimensional point set in the previous frame image, and T represents the calculated frame pose transformation matrix.
Optionally, in step S6, converting the coordinates in the global spatial three-dimensional information into the coordinate system of the crown block in the reservoir area through coordinate-system transformation and providing them to the crown block comprises:
converting the coordinates in the global spatial three-dimensional information into the reservoir area coordinate system and providing them to the crown block for use.
In one aspect, a metallurgical reservoir global three-dimensional reconstruction device based on depth vision is provided, the device is applied to the method of any one of the above items, and the device comprises:
the ORB detection module is used for acquiring original images of continuous frames through the depth camera and depth images corresponding to the original images to form an original image sequence and a depth image sequence;
The similarity matching module is used for matching ORB characteristic points between two adjacent frames of original images to obtain an initial matching point pair;
The local information calculation module is used for extracting depth information of a depth image corresponding to the original image through the depth image sequence, combining the matched characteristic point pair pixel coordinates and the corresponding depth information to obtain local space three-dimensional information;
the pose transformation calculation module is used for calculating a frame pose transformation matrix;
The global information splicing module is used for calculating all frame pose transformation matrixes in the original image sequence, splicing the local space three-dimensional information acquired at different positions by combining the pose transformation matrixes to form global space three-dimensional information;
The coordinate system conversion module is used for converting the coordinates in the global spatial three-dimensional information into the coordinate system of the crown block in the reservoir area through coordinate-system transformation and providing them to the crown block.
Optionally, the ORB detection module is used for acquiring original images and depth images of continuous frames through a depth camera which is arranged on the crown block and moves along with the crown block;
Selecting a current original image and a previous frame of original image to extract ORB feature points to obtain pixel coordinates of the ORB feature points;
After determining the pixel coordinate position of an ORB feature point, taking the ORB feature point as the center, selecting a window 13 pixels in size, selecting 128 groups of comparison point pairs in the window, and calculating the binary code string of the ORB feature point descriptor by comparing the gray value of the center pixel with the gray values of the pixels in the window.
Optionally, a similarity matching module is used for matching ORB characteristic points detected in the current original image and the previous frame original image to form a matching point pair;
selecting the matching point pairs whose descriptor code exclusive-OR value is smaller than the Hamming distance percentage threshold, and taking such a group of feature points as an initial matching point pair;
determining the optimal transformation matrix that satisfies the largest number of matching point pairs through the random sample consensus (RANSAC) algorithm, and eliminating outliers;
and acquiring position coordinate information of the crown block in different images and at different moments, calculating the movement direction of the crown block, removing feature point pairs whose relative positions between the current original image and the previous frame original image do not accord with the movement trend of the crown block, and obtaining the final feature point pair matching information and feature point pair pixel coordinates.
The technical scheme provided by the embodiment of the invention has at least the following beneficial effects:
In this scheme, the depth vision-fused global three-dimensional reconstruction method and device for the metallurgical storage area acquire, in a surface-scanning manner, original images and depth images covering the whole metallurgical storage area in real time, and obtain the global spatial three-dimensional information of the metallurgical storage area by stitching spatial three-dimensional information, thereby achieving a more comprehensive measurement of the stockpiles.
Detailed Description
In order to make the technical problems to be solved, the technical solutions and the advantages of the invention more apparent, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, the embodiment of the invention provides a metallurgical reservoir global three-dimensional reconstruction method based on depth vision, which comprises the following steps:
S1, acquiring original images of continuous frames and the depth images corresponding to the original images through a depth camera to form an original image sequence and a depth image sequence, and respectively detecting ORB (Oriented FAST and Rotated BRIEF, a rapid feature point extraction and description algorithm) feature points in two adjacent frames of original images.
And S2, matching ORB characteristic points between two adjacent frames of original images to obtain initial matching point pairs, screening the initial matching point pairs to obtain characteristic point pair matching information and characteristic point pair pixel coordinates.
And S3, extracting depth information of a depth image corresponding to the original image through the depth image sequence, and combining the matched characteristic point pair pixel coordinates and the corresponding depth information to obtain the local space three-dimensional information.
S4, calculating a frame pose transformation matrix of the original image.
S5, repeatedly executing S1-S4 until all frame pose transformation matrixes in the original image sequence are calculated, and splicing the local space three-dimensional information acquired at different positions by combining the pose transformation matrixes to form global space three-dimensional information.
And S6, converting the coordinates of the global spatial three-dimensional information into the coordinate system of the crown block in the reservoir area through coordinate-system transformation, and providing them to the crown block.
In this embodiment, by using the depth vision-fused global three-dimensional reconstruction system and method for the metallurgical storage area, the global spatial three-dimensional information of the metallurgical storage area can be acquired by stitching spatial three-dimensional information, so that the stockpiles are measured more completely.
As shown in fig. 2, in step S1, original images of consecutive frames and depth images corresponding to the original images are collected by a depth camera to form an original image sequence and a depth image sequence, and the detecting of ORB feature points in two adjacent original images includes:
S11, acquiring original images and depth images of continuous frames through a depth camera which is arranged on the crown block and moves along with the crown block, and forming an original image sequence and a depth image sequence.
S12, selecting the current original image and the previous frame original image to extract ORB feature points, and obtaining the pixel coordinates of the ORB feature points.
S13, taking each ORB feature point as the center, selecting a window 13 pixels in size, selecting 128 groups of comparison point pairs in the window, comparing the gray value of the center pixel with the gray values of the pixels in the window, and calculating the binary code string of the ORB feature point descriptor through formula (1):
wherein I(p) represents the gray value of the center pixel, I(q) represents the gray value of a pixel in the window, and N_p represents the set of pixels contained in the neighborhood of the center pixel within the set window;
the corresponding bit of the binary string is set to 0 if I(q) > I(p) and to 1 if I(q) < I(p).
In this embodiment, it is assumed that an area-array CCD camera that can acquire an original image and a depth image simultaneously is used as the camera.
The corresponding relation between the image pixel plane and the space three-dimensional information can be established by calibrating the camera parameters and integrating the depth image information, and in the embodiment, the internal reference matrix of the depth camera can be expressed as follows:
wherein f_x and f_y are camera calibration parameters, f_x representing the component of the focal length along the X-axis direction in the image pixel coordinate system and f_y representing the component of the focal length along the Y-axis direction in the image pixel coordinate system, and C_x and C_y are camera calibration parameters representing the position coordinates of the image principal point.
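The matrix itself is not shown in the text; with the parameters defined above, the standard pinhole intrinsic (internal reference) matrix takes the form

```latex
K=\begin{bmatrix}
f_x & 0 & C_x\\
0 & f_y & C_y\\
0 & 0 & 1
\end{bmatrix}
```

which is the form assumed in the back-projection of formula (3).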
In this embodiment, the internal reference calibration result of the depth camera is:
For ORB feature point extraction, the current original image and the previous frame original image are first selected to extract ORB feature points, and the feature point pixel coordinates (u, v) are obtained.
In this embodiment, 16 neighborhood pixels q on a circle with a radius of 3 are selected around the center pixel p; if the neighborhood satisfies the following condition, p can be taken as a feature point, the judgment expression being:
wherein N represents the number of coincident pixels (i.e., pixels satisfying the condition) and λ_d represents the pixel difference threshold.
The number of coincident pixels in this embodiment is set to 12 and the pixel difference threshold is set to 15.
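As a rough illustration of the FAST-style judgment described above, the following Python sketch checks a candidate pixel against the 16 pixels on a radius-3 circle using the stated parameters N = 12 and λ_d = 15. The function name, the plain count of qualifying pixels (rather than a contiguous-arc test), and the border handling are assumptions for illustration, not the patented implementation.

```python
import numpy as np

# Offsets of the 16 pixels on a Bresenham circle of radius 3 around the candidate pixel,
# as used by the FAST detector.
CIRCLE_16 = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
             (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(gray: np.ndarray, r: int, c: int, n_required: int = 12, lambda_d: int = 15) -> bool:
    """Return True if pixel (r, c) passes the corner test described in the embodiment:
    at least n_required of the 16 circle pixels differ from the center gray value by more than lambda_d."""
    h, w = gray.shape
    if not (3 <= r < h - 3 and 3 <= c < w - 3):   # skip pixels too close to the image border
        return False
    p = int(gray[r, c])
    qualifying = sum(abs(int(gray[r + dy, c + dx]) - p) > lambda_d for dx, dy in CIRCLE_16)
    return qualifying >= n_required
```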
In this embodiment, the ORB feature points in each frame of image are extracted, and the number of ORB feature points per frame is shown in Table 1:
TABLE 1
As shown in fig. 3, in step S2, the ORB feature points between two adjacent frames of original images are matched to obtain an initial matching point pair, and the filtering of the initial matching point pair to obtain feature point pair matching information and feature point pair pixel coordinates includes:
S21, matching the ORB feature points detected in the current original image and the previous frame original image to form matching point pairs, and measuring the consistency of each matching point pair using the Hamming distance percentage.
S22, substituting the initial matching point pairs obtained in S21 into formula (2) through RANSAC (RANdom SAmple Consensus):
wherein (x', y') represents the initial matching point position in the current original image, (x, y) represents the initial matching point position in the previous frame image, and s represents the scale parameter.
The optimal transformation matrix satisfying the largest number of matching point pairs is determined, and outliers are eliminated.
S23, collecting position coordinate information of the crown block at different times and in different images, calculating the movement direction of the crown block, eliminating feature point pairs whose relative positions between the current original image and the previous frame original image do not accord with the movement trend of the crown block, and obtaining the final feature point pair matching information and feature point pair pixel coordinates.
In this embodiment, the Hamming distance percentage threshold is set to 0.6. The feature points in the current original image and the previous frame original image are matched, and the Hamming distance percentage is used to measure their consistency:
η = Hamming(S_l(x, y), S_r(x', y'))
If the exclusive-OR value of the descriptor codes, expressed as this percentage, is smaller than the threshold, the group of feature points is judged to be a matching point pair;
wherein S_l(x, y) represents an ORB feature point descriptor of the current original image, S_r(x', y') represents an ORB feature point descriptor of the previous frame image, and η represents the Hamming distance percentage.
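A minimal Python sketch of the Hamming-distance-percentage matching described above follows. The brute-force nearest-neighbour search and the function names are assumptions; the embodiment only specifies that pairs with a percentage below the 0.6 threshold are accepted.

```python
import numpy as np

def hamming_percentage(desc_a: np.ndarray, desc_b: np.ndarray) -> float:
    """Fraction of differing bits between two equal-length binary descriptors (arrays of 0/1)."""
    return float(np.count_nonzero(desc_a != desc_b)) / desc_a.size

def initial_matches(descs_cur, descs_prev, eta_threshold=0.6):
    """Brute-force matching: for each current-frame descriptor, keep the previous-frame descriptor
    with the smallest Hamming-distance percentage, accepted only when it is below the threshold."""
    pairs = []
    if not len(descs_prev):
        return pairs
    for i, d_cur in enumerate(descs_cur):
        dists = [hamming_percentage(d_cur, d_prev) for d_prev in descs_prev]
        j = int(np.argmin(dists))
        if dists[j] < eta_threshold:
            pairs.append((i, j, dists[j]))   # (index in current frame, index in previous frame, distance)
    return pairs
```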
The number of ORB feature point pairs initially matched by the Hamming distance in this embodiment is shown in Table 2:
TABLE 2
In this embodiment, the initial matching point pairs are substituted into the transformation matrix iteratively through RANSAC, the optimal transformation matrix satisfying the largest number of matching point pairs is determined, outliers are removed, and the final feature point pair matching information and feature point pair pixel coordinates are obtained through the motion trend constraint. In this embodiment, the result of the ORB feature point matching algorithm based on the motion trend constraint is shown in fig. 4.
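The exact criterion of the motion trend constraint is not spelled out in the text; the following Python sketch shows one plausible interpretation, in which a match is kept only when its pixel displacement between frames is roughly aligned with the axis of the crown block's movement direction. The angular tolerance, the function name and the sign convention of the displacement are assumptions.

```python
import numpy as np

def filter_by_motion_trend(matches, pts_cur, pts_prev, crane_direction, angle_tol_deg=45.0):
    """Keep only matches whose pixel displacement between the previous and current frame is roughly
    aligned with the axis of the crown block's movement direction (a 2D vector in image coordinates)."""
    d = np.asarray(crane_direction, dtype=float)
    d /= (np.linalg.norm(d) + 1e-9)
    cos_tol = np.cos(np.deg2rad(angle_tol_deg))
    kept = []
    for i, j, _ in matches:
        disp = np.asarray(pts_cur[i], dtype=float) - np.asarray(pts_prev[j], dtype=float)
        norm = np.linalg.norm(disp)
        if norm < 1e-6:
            kept.append((i, j))               # negligible motion: nothing to contradict the trend
            continue
        # abs() accepts displacement along either sense of the motion axis, since static scene points
        # move opposite to the camera on the image plane; the sign convention is an assumption.
        if abs(np.dot(disp / norm, d)) >= cos_tol:
            kept.append((i, j))
    return kept
```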
The number of fully consistent ORB feature point matching pairs obtained in this embodiment after the RANSAC random consistency constraint and the motion trend constraint is shown in Table 3:
TABLE 3

| Matching images | Matching feature points (pair) | Matching images | Matching feature points (pair) |
| 1-2 | 22 | 11-12 | 10 |
| 2-3 | 20 | 12-13 | 18 |
| 3-4 | 22 | 13-14 | 20 |
| 4-5 | 30 | 14-15 | 18 |
| 5-6 | 32 | 15-16 | 41 |
| 6-7 | 11 | 16-17 | 54 |
| 7-8 | 9 | 17-18 | 85 |
| 8-9 | 6 | 18-19 | 99 |
| 9-10 | 10 | 19-20 | 180 |
| 10-11 | 8 | | |
As shown in fig. 5, in step S3, depth information of a depth image corresponding to an original image is extracted through a sequence of depth images, and local spatial three-dimensional information is obtained by combining the matched feature point pair pixel coordinates and the corresponding depth information, including:
S31, extracting the depth value D from the depth image corresponding to the original image through the depth image sequence.
S32, substituting the feature point pair pixel coordinates obtained in step S2 and the depth value D into formula (3), and calculating the local spatial three-dimensional coordinates of the feature points to obtain the local spatial three-dimensional information:
wherein (u, v) represents the pixel coordinates of a feature point; C_x and C_y are camera calibration parameters representing the position coordinates of the image principal point; f_x and f_y are camera calibration parameters, f_x representing the component of the focal length along the X-axis direction in the image pixel coordinate system and f_y representing the component of the focal length along the Y-axis direction in the image pixel coordinate system; and X, Y, D represent the calculated three-dimensional coordinates.
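A short Python sketch of how formula (3) could be applied to the matched feature points of one frame is shown below, assuming the back-projection form reconstructed earlier; skipping zero (invalid) depth readings and the function names are assumptions.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Formula (3) as reconstructed above: pixel coordinates (u, v) plus depth D to local 3D coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def local_point_set(feature_pixels, depth_image, fx, fy, cx, cy):
    """Build the local spatial 3D point set for the matched feature points of one frame."""
    points = []
    for u, v in feature_pixels:
        d = float(depth_image[int(v), int(u)])   # depth image indexed as (row, column)
        if d > 0:                                 # skip invalid (zero) depth readings
            points.append(backproject(u, v, d, fx, fy, cx, cy))
    return np.asarray(points)
```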
In this embodiment, the spatial transformation matrix is converted into quaternion form using the Rodrigues formula and stored.
The frame pose conversion values between the respective groups of images can be expressed as shown in table 4 using quaternions:
TABLE 4 Table 4
As shown in fig. 6, in step S4, a frame pose transformation matrix of the original image is calculated, including:
S41, constructing a spatial point set error term between the current original image and the previous frame original image, and calculating the spatial transformation matrix with the minimum spatial position error through singular value decomposition (SVD) of formula (4),
wherein p_i represents the spatial three-dimensional point set obtained after feature point matching of the current original image, q_i represents the spatial three-dimensional point set obtained after feature point matching of the previous frame original image, E_i represents the spatial position error set of the feature point pairs corresponding to the spatial transformation matrix [R|t], R represents the spatial rotation matrix that minimizes the spatial position error, and t represents the corresponding translation vector.
S42, converting the spatial transformation matrix into quaternion format for storage to obtain the frame pose transformation matrix.
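A compact Python sketch of the SVD-based least-squares alignment in the spirit of formula (4) is given below (the standard Kabsch/Umeyama procedure); the function name and the handling of degenerate cases are assumptions.

```python
import numpy as np

def estimate_rigid_transform(p: np.ndarray, q: np.ndarray):
    """Least-squares rigid transform [R|t] minimising sum ||p_i - (R q_i + t)||^2 via SVD,
    in the spirit of formula (4); p and q are (N, 3) arrays of matched spatial 3D points."""
    cp, cq = p.mean(axis=0), q.mean(axis=0)
    H = (q - cq).T @ (p - cp)                 # 3x3 cross-covariance of the centred point sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = cp - R @ cq
    return R, t
```

The resulting rotation R could then be converted to a quaternion for storage, e.g. with scipy.spatial.transform.Rotation.from_matrix(R).as_quat(); in principle the same routine could also fit the transform between the vision coordinate system and the reservoir area coordinate system from the corresponding points of Table 6.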
In step S5, the stitching is performed on the local spatial three-dimensional information acquired at different positions in combination with the pose transformation matrix to form global spatial three-dimensional information, which includes:
Combining the local space three-dimensional information under the images at different moments with a frame pose transformation matrix, and splicing the space three-dimensional information by using the space transformation matrix through a formula (5) to form global space three-dimensional information:
P_i = T · P'_i   (5)
wherein P_i represents the spatial three-dimensional point set in the current original image, P'_i represents the spatial three-dimensional point set in the previous frame image, and T represents the calculated frame pose transformation matrix.
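A minimal Python sketch of the stitching step of formula (5) follows; whether T maps the previous frame into the current frame or each local frame into a common global frame depends on how the pairwise transforms are composed, so the local-to-global convention and the function names used here are assumptions.

```python
import numpy as np

def to_homogeneous(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Assemble a 4x4 frame pose transformation matrix T from a rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def stitch(global_points: np.ndarray, new_points: np.ndarray, T_local_to_global: np.ndarray) -> np.ndarray:
    """Formula (5) in spirit: map a newly acquired (N, 3) local point set into the global frame and append it."""
    homog = np.hstack([new_points, np.ones((len(new_points), 1))])
    mapped = (T_local_to_global @ homog.T).T[:, :3]
    return np.vstack([global_points, mapped]) if len(global_points) else mapped
```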
In this embodiment, the image acquisition resolution is 1280×720, so the maximum number of spatial three-dimensional points that can be acquired from a single image is 921600, and the corresponding actual coverage area, calculated from the installation height, is 10 m × 7 m.
The spatial three-dimensional points in each group of images are stitched, and the spatial three-dimensional point information under each group of images is counted as shown in Table 5:
TABLE 5
In this embodiment, the global point cloud obtained by stitching the spatial three-dimensional information raises the detection coverage to 20 m × 7 m.
In step S6, converting the global spatial three-dimensional information into the coordinate system of the crown block in the reservoir area through coordinate-system transformation and providing it to the crown block comprises:
converting the global spatial three-dimensional information into the reservoir area coordinate system and providing it to the crown block for use.
In this embodiment, the coordinate information of the same detection point in the visual detection system and the spatial detection system under the same scene is obtained, coordinate system fitting is performed, the coordinate conversion relation is calculated, and finally the depth vision-based detection result is converted into the reservoir area coordinate system and provided to the crown block for operation.
In this embodiment, the coordinate values of the same point in the coordinate system of the first image of the continuous image sequence acquired by the depth camera and in the reservoir area coordinate system are obtained, as shown in Table 6:
TABLE 6
In this embodiment, the spatial transformation matrix fitting result of the visual coordinate system and the reservoir coordinate system is:
The embodiment of the invention provides a metallurgical reservoir global three-dimensional reconstruction device 700 based on depth vision, wherein the device 700 is used for realizing the metallurgical reservoir global three-dimensional reconstruction method based on depth vision, and as shown in fig. 7, the device comprises:
The ORB detection module 701 is configured to collect original images of consecutive frames and depth images corresponding to the original images by using a depth camera, and form an original image sequence and a depth image sequence;
The similarity matching module 702 is configured to match ORB feature points between two adjacent frames of original images to obtain an initial matching point pair, and screen the initial matching point pair to obtain feature point pair matching information and feature point pair pixel coordinates.
The local information calculation module 703 is configured to extract depth information of a depth image corresponding to the original image through the depth image sequence, and combine the matched feature point pair pixel coordinates and the corresponding depth information to obtain local spatial three-dimensional information.
The pose transformation calculation module 704 is configured to calculate a frame pose transformation matrix.
The global information stitching module 705 is configured to calculate all frame pose transformation matrices in the original image sequence, and stitch the local spatial three-dimensional information acquired at different positions together with the pose transformation matrices to form global spatial three-dimensional information.
The coordinate system conversion module 706 is configured to convert the global spatial three-dimensional information into the coordinate system of the crown block in the reservoir area through coordinate-system transformation and provide it to the crown block.
In this embodiment, the depth camera is mounted on the crown block and connected to the ground server through the ethernet, and the original image and the depth image covering the whole metallurgical storage area are collected in real time in a surface scanning manner along with the movement of the crown block.
The ORB detection module 701 is used for acquiring original images and depth images of continuous frames through a depth camera which is arranged on the crown block and moves along with the crown block;
Selecting a current original image and a previous frame of original image to extract ORB feature points to obtain pixel coordinates of the ORB feature points;
After determining the pixel coordinate position of an ORB feature point, taking the ORB feature point as the center, selecting a window 13 pixels in size, selecting 128 groups of comparison point pairs in the window, and calculating the binary code string of the ORB feature point descriptor by comparing the gray value of the center pixel with the gray values of the pixels in the window.
The similarity matching module 702 is configured to match the ORB feature points detected in the current original image and the previous original image to form a matching point pair;
selecting the matching point pairs whose descriptor code exclusive-OR value is smaller than the Hamming distance percentage threshold, and taking such a group of feature points as an initial matching point pair;
determining the optimal transformation matrix that satisfies the largest number of matching point pairs through the random sample consensus (RANSAC) algorithm, and eliminating outliers;
and acquiring position coordinate information of the crown block in different images and at different moments, calculating the movement direction of the crown block, removing feature point pairs whose relative positions between the current original image and the previous frame original image do not accord with the movement trend of the crown block, and obtaining the final feature point pair matching information and feature point pair pixel coordinates.
According to the depth vision-fused global three-dimensional reconstruction system and method for the metallurgical storage area, the global spatial three-dimensional information of the metallurgical storage area can be acquired by stitching spatial three-dimensional information, so that the stockpiles are measured more completely.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.