CN105184857B

CN105184857B - Monocular vision based on structure light ranging rebuilds mesoscale factor determination method

Info

Publication number: CN105184857B
Application number: CN201510580648.5A
Authority: CN
Inventors: 李秀智; 秦宝岭; 贾松敏; 杨爱林
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2015-09-13
Filing date: 2015-09-13
Publication date: 2018-05-25
Anticipated expiration: 2035-09-13
Also published as: CN105184857A

Abstract

A method for determining the scale factor in monocular vision reconstruction based on point structured light ranging, the method includes spot centroid center positioning, spatial straight line fitting, RANSAC exclusion, calculating the three-dimensional space point coordinates of the spot, and calculating the scale factor. Aiming at the traditional three-dimensional reconstruction method of monocular vision based on image sequence, most of them can only realize three-dimensional reconstruction in projective scale or affine scale. , so that the scale of the reconstructed 3D scene using the image sequence is consistent with that of the real world scene. The technical features of the present invention are as follows: (1) monocular reconstruction introduces structured light active vision to realize Euclidean three-dimensional reconstruction, (2) center-of-mass method spot positioning, (3) laser ray equation fitting with RANSAC elimination mechanism, (4) reflection Projection-optimized point positioning of spot space, (5) Euclidean reconstruction of various methods of monocular reconstruction is not limited.

Description

Scale Factor Determination Method for Monocular Vision Reconstruction Based on Point Structured Light Ranging

技术领域technical field

本发明属于计算机视觉领域，涉及利用点结构光实现基于图像序列的欧氏三维重建的方法。The invention belongs to the field of computer vision and relates to a method for realizing Euclidean three-dimensional reconstruction based on image sequence by using point structured light.

背景技术Background technique

视觉是人类最重要的感知手段,大约有80％的外界信息是通过眼睛被人接收的。正是因为视觉对人类的重要性,随着数字计算机的飞速发展,让计算机也具有视觉,能够处理视觉信息就成了一项非常诱人的研究课题。这样,就导致了计算机视觉这一学科的产生和发展。Vision is the most important means of human perception, about 80% of the external information is received by people through the eyes. It is precisely because of the importance of vision to human beings that with the rapid development of digital computers, it has become a very attractive research topic to make computers have vision and be able to process visual information. In this way, it leads to the emergence and development of the subject of computer vision.

计算机视觉的研究中一个很重要的部分就是对获得的动态图像序列进行分析处理,以得到有用信息。动态图像是针对运动的物体或景物而言的,他们不仅是空间位置的函数,而且是随时间变化的,它为我们提供了比单一图像更丰富的信息。在对某一景物拍摄到的图像序列中,相邻两帧图像间至少有一部分像素的灰度及色彩发生了变化,这个图像序列就称之为动态图像序列。A very important part of computer vision research is to analyze and process the obtained dynamic image sequence to obtain useful information. Dynamic images are aimed at moving objects or scenes, they are not only a function of spatial position, but also change with time, it provides us with richer information than a single image. In the image sequence captured by a certain scene, the grayscale and color of at least a part of the pixels between two adjacent frames of images have changed, and this image sequence is called a dynamic image sequence.

近年来随着计算机视觉领域研究的迅速发展，利用图像序列中的二维信息来恢复真实世界三维结构的问题也是其中热点问题和重要研究方向之一。不管国内还是国际上相关领域的研究人员针对这些问题已经提出了一些较为有效的解决方法。其中,基于图像序列的单目视觉三维重建方法以其重建约束限制条件少、重建所需预知的信息量少、适宜对大尺度场景进行重建的优点,成为了解决该问题的一类最主要的方法。在多视图几何中一个重要的理论模式就是层次理论，它定义了一个关于从真实场景到其重建模型的等级层次，层次理论涉及的变换主要有射影变换、仿射变换、相似变换和欧氏变换。然而,传统的基于图像序列的单目视觉三维重建方法大多只能实现射影尺度或仿射尺度下的三维重建,即重建后的结果与现实世界场景尺度相差一个尺度因子,这样就会限制其在现实中的应用。针对于此,本发明提出一种利用点结构光作为辅助来实现单目视觉的欧氏三维重建方法,使得利用图像序列重建后的三维场景与现实世界场景的尺度保持一致,并与此同时提高三维重建算法的鲁棒性和精确性。In recent years, with the rapid development of research in the field of computer vision, the problem of using two-dimensional information in image sequences to restore the three-dimensional structure of the real world is also one of the hot issues and one of the important research directions. Researchers in related fields both domestically and internationally have proposed some effective solutions to these problems. Among them, the monocular vision 3D reconstruction method based on image sequences has become the most important method to solve this problem due to its advantages of less reconstruction constraints, less predictable information required for reconstruction, and suitable for reconstruction of large-scale scenes. method. An important theoretical model in multi-view geometry is the hierarchy theory, which defines a hierarchy from the real scene to its reconstruction model. The transformations involved in the hierarchy theory mainly include projective transformation, affine transformation, similarity transformation and Euclidean transformation. . However, most of the traditional monocular vision 3D reconstruction methods based on image sequences can only achieve 3D reconstruction at the projective scale or affine scale, that is, the reconstructed result differs from the scene scale of the real world by a scale factor. application in reality. In view of this, the present invention proposes a Euclidean three-dimensional reconstruction method using point structured light as an auxiliary to realize monocular vision, so that the scale of the three-dimensional scene reconstructed by using image sequences is consistent with that of the real world scene, and at the same time improves the Robustness and accuracy of 3D reconstruction algorithms.

根据光学投射器所投射的光束模式的不同，结构光模式又可以分为点结构光模式、线结构光模式、多线结构光模式以及网格结构光模式等。单纯的利用单目视觉得到的三维场景无法达到欧氏三维场景真实重建效果，本发明利用点结构光和单目相机实现欧氏三维重建。激光器发出的光束投射到物体表面上产生一个光点，光点经摄像机的镜头成像在摄像机的像平面上，形成一个二维像点。摄像机的视线和光束线在空间中于光点处相交，并由其可以唯一确定光点在某一已知世界坐标系中的空间位置，进而求得空间尺度因子达到欧式三维重建的效果。According to the beam pattern projected by the optical projector, the structured light mode can be divided into point structured light mode, line structured light mode, multi-line structured light mode and grid structured light mode, etc. The three-dimensional scene obtained by simply using monocular vision cannot achieve the real reconstruction effect of the Euclidean three-dimensional scene. The present invention uses point structured light and a monocular camera to realize the Euclidean three-dimensional reconstruction. The beam emitted by the laser is projected onto the surface of the object to generate a light spot, which is imaged on the image plane of the camera through the lens of the camera to form a two-dimensional image point. The line of sight of the camera and the beam line intersect at the light point in space, and the spatial position of the light point in a known world coordinate system can be uniquely determined by it, and then the spatial scale factor can be obtained to achieve the effect of European 3D reconstruction.

本发明通过设计以点结构光为辅助的单目立体视觉对真实场景实现真实尺度的欧氏三维重建。原理图如附图1所示。首先，固定靶标(8×11的黑白棋盘格)和激光器，将靶标放在相机视野范围内，拍摄一张图片(图中如靶标1位置所示)，然后打开激光器将激光打到靶标上再拍一张图片(图中如靶标2位置所示)，移动靶标重复多次上面的操作获得多幅图像(靶标3,4,5等)，对输入的图像进行预处理。然后，利用图像帧差法获取激光点光斑，利用质心法求取图像中光斑质心的坐标。根据靶标与图像之间的单应关系获得摄像机坐标系下所有光斑的空间点三维坐标，根据多个光斑空间点三维坐标拟合激光器投出的射线方程l₁，最后，从相机光心发出的经过光斑质心的射线l₂与激光器投出的射线的交点(由于误差存在，l₁和l₂会成为异面直线，实际应用中使用异面直线公垂线中点)即为真实的光斑空间三维点，这样即可求出尺度因子的大小。The present invention realizes real-scale Euclidean three-dimensional reconstruction of real scenes by designing monocular stereo vision aided by point structured light. The principle diagram is shown in Figure 1. First, fix the target (8×11 black and white checkerboard) and the laser, place the target within the field of view of the camera, take a picture (as shown in the position of target 1 in the figure), then turn on the laser to hit the laser on the target and then Take a picture (as shown in the position of target 2 in the figure), move the target and repeat the above operation several times to obtain multiple images (target 3, 4, 5, etc.), and preprocess the input image. Then, the image frame difference method is used to obtain the laser point spot, and the centroid method is used to obtain the coordinates of the center of mass of the spot in the image. According to the homography between the target and the image, the three-dimensional coordinates of all the spots in the camera coordinate system are obtained, and the ray equation l ₁ emitted by the laser is fitted according to the three-dimensional coordinates of the multiple spots in the space. Finally, the light emitted from the optical center of the camera The intersection point of the ray l ₂ passing through the center of the spot and the ray projected by the laser (due to the existence of errors, l ₁ and l ₂ will become straight lines of different planes, and the midpoint of the common vertical line of the straight lines of different planes is used in practical applications) is the real spot space 3D points, so that the size of the scale factor can be obtained.

发明内容Contents of the invention

本发明从点结构光出发,提出一种以点结构光为辅助的单目视觉欧氏三维重建方法,包括光斑质心中心定位、加入RANSAC排异的空间直线拟合、求取光斑三维空间点坐标、尺度因子的求取。Starting from point structured light, the present invention proposes a monocular Euclidean 3D reconstruction method assisted by point structured light, including spot centroid center positioning, spatial straight line fitting with RANSAC rejection, and calculation of spot coordinates in 3D space , Finding the scale factor.

本发明采用的技术方案为基于点结构光测距的单目视觉重建中尺度因子确定方法，尺度因子求取的整体流程图如附图2所示。整个标定均统一到摄像机坐标系上。摄像机内部参数使用基于2D平面靶标的摄像机标定方法进行标定，在标定结构光系统参数时，保持相机和激光器的空间位置不发生变化。然后将激光光斑投射于棋盘靶标所处的平面上，拍下此时的图像，然后关掉激光器再拍下此时的图像，按照此步骤重复将靶标摆放任意次获得一组数据图像，根据背景差分法可获得每个位置的靶标对应的光斑信息，如附图3所示。利用靶标上的角点信息得到每次摆放靶标平面的外参数矩阵，再加上每次标定图像中激光光斑的质心坐标信息，就得到每次摆放靶标时激光光斑质心的空间坐标。这样将多次摆放求得的激光光斑质心坐标的结果进行Levenberg-Marquardt拟合，即可得到激光光线的空间方程。通过标定好的敏感器内参数信息得到图像平面上激光光斑的空间坐标。理论上激光光线与相机光心和激光光斑在图像上的像点坐标直线会完全重合，但是，由于标定和计算的误差，这两条直线会形成异面直线，设定两条异面直线的公垂线中点为激光亮点的真实空间坐标，真实尺度因子即利用点结构光标定得到光斑的真实空间坐标与运用三维重建算法得到的光斑空间坐标之比，进而达到欧氏重建效果。The technical solution adopted in the present invention is a method for determining the scale factor in monocular vision reconstruction based on point structured light ranging, and the overall flow chart for calculating the scale factor is shown in Figure 2. The whole calibration is unified to the camera coordinate system. The internal parameters of the camera are calibrated using a camera calibration method based on a 2D plane target. When calibrating the parameters of the structured light system, the spatial positions of the camera and laser are kept unchanged. Then project the laser spot on the plane where the checkerboard target is located, take the image at this time, then turn off the laser and take the image at this time, repeat this step to place the target any number of times to obtain a set of data images, according to The background difference method can obtain the spot information corresponding to the target at each position, as shown in Figure 3. Using the corner point information on the target to obtain the external parameter matrix of the target plane each time, plus the coordinate information of the center of mass of the laser spot in each calibration image, the spatial coordinates of the center of mass of the laser spot are obtained each time the target is placed. In this way, the result of Levenberg-Marquardt fitting of the coordinates of the center of mass of the laser spot obtained by multiple placements can be used to obtain the spatial equation of the laser light. The spatial coordinates of the laser spot on the image plane are obtained through the calibrated sensor internal parameter information. Theoretically, the laser light and the camera optical center and the image point coordinate line of the laser spot on the image will completely coincide. However, due to calibration and calculation errors, these two straight lines will form a straight line of different planes. The midpoint of the common vertical line is the real space coordinate of the laser spot, and the real scale factor is the ratio of the real space coordinate of the spot obtained by using the point structure cursor to the space coordinate of the spot obtained by using the three-dimensional reconstruction algorithm, thereby achieving the Euclidean reconstruction effect.

(1)质心法光斑定位(1) Centroid method spot positioning

光斑图像是现有图像处理中较为常见的图像，光斑质心是光斑图像的重要特征之一。在视觉测量、卫星导航等诸多领域中，如何实现光斑中心的快速精确定位是现如今国内外研究的一个重要课题。The spot image is a relatively common image in existing image processing, and the spot centroid is one of the important features of the spot image. In many fields such as visual measurement and satellite navigation, how to realize the rapid and accurate positioning of the spot center is an important topic of research at home and abroad.

点状光斑的中心定位方法可分为基于灰度和基于边缘的两大类。基于灰度的方法一般利用目标的灰度分布信息，适用于半径较小且灰度分布均匀的光斑；基于边缘的方法一般利用目标的边缘形状信息，适用于半径较大的光斑。因此，小尺寸光斑通常采用基于灰度的方法进行中心定位。目前常用的基于灰度的中心定位方法包括三种：即质心法、Hessian矩阵法和高斯拟合法。质心法是用的最多的一种细分定位方法，它实现比较简单，运算速度快，而且有一定的定位精度，因此采用质心光斑定位法，具体实现方法在下文会有详细阐述。The center localization methods of point-like spots can be divided into two categories based on grayscale and edge-based. The grayscale-based method generally uses the grayscale distribution information of the target, and is suitable for spots with a small radius and uniform grayscale distribution; the edge-based method generally uses the edge shape information of the target, and is suitable for spots with a large radius. Therefore, small-sized spots are usually centered using grayscale-based methods. At present, there are three commonly used gray-based center positioning methods: the centroid method, the Hessian matrix method, and the Gaussian fitting method. The centroid method is the most commonly used subdivision positioning method. It is relatively simple to implement, fast in operation speed, and has a certain positioning accuracy. Therefore, the centroid spot positioning method is used. The specific implementation method will be described in detail below.

(2)加入RANSAC剔除机制的激光器射线方程拟合(2) Laser ray equation fitting with RANSAC elimination mechanism

通过以上步骤可以获得多组质心光斑坐标，然后进行空间点激光器射线方程的拟合，本发明利用OpenCV中的空间直线拟合函数，能够较准确且方便的获得射线方程。然而，由于在求取质心坐标等步骤的误差，获得的光斑质心坐标可能会存在较大的误差，这样会使得拟合出来的空间直线方程与真实的直线方程存在较大误差，为了减小误差使求得的空间直线方程尽可能的准确，本发明提出使用RANSAC排异方法，剔除误差较大的空间点，结果如附图4所示，在具体实施方式中将详细介绍。Multiple sets of centroid spot coordinates can be obtained through the above steps, and then the ray equation of the spatial point laser can be fitted. The present invention utilizes the space straight line fitting function in OpenCV to obtain the ray equation more accurately and conveniently. However, due to errors in steps such as obtaining the coordinates of the center of mass, there may be a large error in the coordinates of the center of mass of the obtained spot, which will cause a large error between the fitted space linear equation and the real linear equation. In order to reduce the error To make the obtained spatial linear equation as accurate as possible, the present invention proposes to use the RANSAC exclusion method to eliminate spatial points with large errors. The result is shown in Figure 4, which will be described in detail in the specific implementation.

(3)反投影优化的光斑空间点定位(3) Spatial point positioning of spot optimized by back projection

在根据以上步骤求得激光器射线方程后，摄像机的视线和光束线在空间中于光点处相交,然而，由于图像平面坐标的测量误差、噪声以及相机畸变等因素的影响，摄像机视线和光束线并不能完全交于一点，所以交汇定位问题就是求异面直线公垂线段中点的坐标。本发明采用将异面直线公垂线段中点算法求得的空间三维点作为图像光斑对应的空间点三维坐标，本文进一步提出利用反投影迭代法来优化求得的空间三维点坐标，具体实现方法在下文会有详细阐述。After obtaining the laser ray equation according to the above steps, the line of sight of the camera and the beam line intersect at the light point in space. They cannot be completely intersected at one point, so the intersection location problem is to find the coordinates of the midpoint of the common perpendicular segment of straight lines on different planes. In the present invention, the three-dimensional point in space obtained by the algorithm of the midpoint of the common vertical segment of straight lines on different planes is used as the three-dimensional coordinates of the space point corresponding to the image spot. This paper further proposes to use the back-projection iterative method to optimize the obtained three-dimensional point coordinates in space, and the specific implementation method It will be explained in detail below.

与现有技术相比,本发明具有如下有益效果Compared with the prior art, the present invention has the following beneficial effects

(1)单目重建引入结构光主动视觉实现欧氏三维重建(1) Monocular reconstruction introduces structured light active vision to realize Euclidean 3D reconstruction

本发明并不是传统的点结构光重建，而是在运用单目视觉实现三维重建的基础上通过点结构光引入真实尺度因子,从而实现真实尺度的欧氏三维重建。首先，运用基于光流反馈的单目视觉三维重建算法实现对环境快速准确的立体化建模，然而，图像序列重建后的三维场景与真实的三维场景之间缺少一个真实尺度因子，没有达到欧氏三维重建的效果，因此，求取真实尺度因子的方法就是利用点结构光标定得到的光斑的真实空间坐标标记为[X,Y,Z]与运用三维重建算法得到的光斑的空间坐标标记为[X₀,Y₀,Z₀]之比作为尺度因子λ，即λ＝[X,Y,Z]/[X₀,Y₀,Z₀]这样即可实现单目视觉的欧氏三维重建。The present invention is not traditional point structured light reconstruction, but on the basis of using monocular vision to realize three-dimensional reconstruction, the real scale factor is introduced through point structured light, so as to realize real scale Euclidean three-dimensional reconstruction. First of all, the monocular 3D reconstruction algorithm based on optical flow feedback is used to achieve fast and accurate three-dimensional modeling of the environment. However, there is a lack of a real scale factor between the 3D scene reconstructed from the image sequence and the real 3D scene. Therefore, the method to obtain the real scale factor is to use the point structure cursor calibration to obtain the real space coordinates of the spot marked as [X, Y, Z] and the space coordinates of the spot obtained by using the 3D reconstruction algorithm to be marked as The ratio of [X ₀ , Y ₀ , Z ₀ ] is used as the scale factor λ, that is, λ=[X,Y,Z]/[X ₀ ,Y ₀ ,Z ₀ ] so that the Euclidean three-dimensional reconstruction of monocular vision can be realized .

(2)单目重建各种方法的欧氏重建不受限制(2) Euclidean reconstruction of various methods of monocular reconstruction is not limited

基于多视图像的场景重建一直以来都是计算机视觉领域的研究热点，多视立体重建是利用在不同位置拍摄的关于某场景的一系列图像来恢复真实的三维场景。如今，主流的致密重建方法包括基于体素的方法，基于多边形网格变形的方法，基于多视深度图的方法，以及基于面片扩展的方法等。然而，以上方法重建的结果与真实场景的重建结果相差一个尺度因子，无法实现真实场景的欧氏重建，针对于此本发明提出的基于点结构光的单目视觉欧氏三维重建方法均不受重建方法的限制，都能够实现欧氏三维重建。只要根据点结构光确定激光器投射到靶标上光斑的真实空间点坐标，并且确定图像光斑坐标与真实空间点坐标的对应关系就能求得尺度因子，所以在使用其他方法的三维重建过程中不会受到重建方法的限制，都能够实现欧氏三维重建。Scene reconstruction based on multi-view images has always been a research hotspot in the field of computer vision. Multi-view stereo reconstruction uses a series of images of a scene taken at different locations to restore the real 3D scene. Today, the mainstream dense reconstruction methods include voxel-based methods, methods based on polygonal mesh deformation, methods based on multi-view depth maps, and methods based on patch expansion. However, the reconstruction result of the above method is different from the reconstruction result of the real scene by a scale factor, and the Euclidean reconstruction of the real scene cannot be realized. For this, the monocular vision Euclidean three-dimensional reconstruction method based on point structured light proposed by the present invention is not subject to Reconstruction methods are limited and are able to achieve Euclidean 3D reconstruction. As long as the real space point coordinates of the laser spot projected on the target are determined according to the point structured light, and the corresponding relationship between the image spot coordinates and the real space point coordinates is determined, the scale factor can be obtained, so it will not be used in the 3D reconstruction process using other methods. Restricted by the reconstruction method, Euclidean 3D reconstruction can be realized.

附图说明Description of drawings

图1激光器射线求取原理图。Fig. 1 Schematic diagram of obtaining laser rays.

图2整体流程图。Figure 2 overall flow chart.

图3光斑质心求取。Figure 3 Calculation of spot centroid.

图4加入RANSAC直线拟合。Figure 4 adds RANSAC straight line fitting.

具体实施方式Detailed ways

结合附图对本发明作进一步的详细说明。The present invention will be described in further detail in conjunction with the accompanying drawings.

本发明具体包括以下几个步骤。The present invention specifically includes the following steps.

(1)点状光斑的中心定位(1) Center positioning of spot light spot

先通过各种滤波或阈值选取方式对图像进行预处理，然后再对图像光斑进行质心定位。对整个图像像素处理结束后，根据各个光斑最终质心参数组累加值，按照如下的一阶矩质心计算公式计算其质心行列坐标：The image is preprocessed through various filtering or threshold selection methods, and then the centroid of the image spot is located. After processing the entire image pixel, according to the cumulative value of the final centroid parameter group of each spot, the row and column coordinates of the centroid are calculated according to the following first-order moment centroid calculation formula:

上式中I(x,y)表示输入图像像素的灰度值，x,y是该像素对应的行列坐标，X_c,Y_c分别为光斑质心的行列坐标。In the above formula, I(x, y) represents the gray value of the input image pixel, x, y are the row and column coordinates corresponding to the pixel, and X _c , Y _c are the row and column coordinates of the centroid of the light spot respectively.

RANSAC通过反复选择数据中的一组随机子集来达成目标。被选取的子集被假设为局内点，并用下述方法进行验证，其模型描述如下：RANSAC achieves its goal by iteratively selecting a set of random subsets in the data. The selected subset is assumed to be the interior point, and is verified by the following method, and the model description is as follows:

1)从所有光斑质心I中选取n个点作为内点，根据最小二乘法利用n个点可以估计空间直线方程L:Ax+By+Cz+D＝0，即方程的所有的未知参数(A,B,C,D)都能从假设的局内点计算得出。1) Select n points from all spot centroids I as interior points, and use n points according to the least squares method to estimate the space linear equation L: Ax+By+Cz+D=0, that is, all unknown parameters of the equation (A ,B,C,D) can be calculated from the hypothetical interior points.

2)用1)中得到的模型去测试所有的其它剩余的光斑质心(其光斑质心个数为I-n)，如果某个光斑质心点适用于估计的空间直线模型L，即如果某个光斑质心坐标(x’,y’,z’)满足Ax’+By’+Cz’+D<|σ|,认为此质心点也是局内点，其中σ为设定的阈值。2) Use the model obtained in 1) to test all other remaining spot centroids (the number of spot centroids is I-n), if a certain spot centroid point is suitable for the estimated space linear model L, that is, if a certain spot centroid coordinate (x', y', z') satisfy Ax'+By'+Cz'+D<|σ|, and consider this centroid point to be an intra-local point, where σ is the set threshold.

3)如果有足够多的光斑质心点被归类为假设的局内点，即有足够多的光斑质心点(假设有m个点满足)满足Ax’+By’+Cz’+D<|σ|,，那么估计的模型L就足够合理。3) If there are enough spot centroid points classified as hypothetical interior points, that is, there are enough spot centroid points (assuming m points are satisfied) to satisfy Ax'+By'+Cz'+D<|σ| , then the estimated model L is reasonable enough.

4)然后，用m个满足条件的局内点去重新估计模型，因为它仅仅被初始的假设局内点估计过。4) Then, re-estimate the model with m inlier points satisfying the condition, because it has only been estimated by the initial hypothetical inlier points.

5)最后，通过估计局内点与模型的错误率来评估模型。5) Finally, the model is evaluated by estimating the error rate of the inlier points and the model.

这个过程被重复执行固定的次数t次，每次产生的模型要么因为局内点太少而被舍弃，要么因为比现有的模型更好而被选用。This process is repeated for a fixed number of t times, and the model generated each time is either discarded because there are too few intra-site points, or selected because it is better than the existing model.

(3)求取光斑三维空间点坐标(3) Calculate the coordinates of the three-dimensional space point of the spot

然而由相机光心发出经过光斑质心的射线与激光器投出的射线的交点由于误差原因并不能完全交于一点，因此，采用异面直线公垂线中点算法，实际应用中使用这个线间距的线段中点作为激光亮点的空间坐标，这样即可求得图像中光斑质心的图像坐标对应的空间点三维坐标。具体数学模型如下:通过相机光心发出的射线O₁P₁方向向量为V₁,相机坐标系下拟合的激光器发出光线O₂P₂的方程方向向量表示为V₂,则其公垂线的方向向量为V＝V₁×V₂,设公垂线与O₁P₁、O₂P₂的交点分别为M₁(x₁,y₁,z₁)和M₂(x₂,y₂,z₂)，则激光光斑的空间三维点坐标为M(x,y,z)，其中x＝(x₁+x₂)/2，y＝(y₁+y₂)/2，z＝(z₁+z₂)/2。However, the intersection point of the ray from the optical center of the camera passing through the center of mass of the spot and the ray from the laser cannot completely intersect at one point due to errors. The midpoint of the line segment is used as the spatial coordinates of the laser bright spot, so that the three-dimensional coordinates of the spatial point corresponding to the image coordinates of the spot centroid in the image can be obtained. The specific mathematical model is as follows: the direction vector of the ray O ₁ P ₁ emitted through the optical center of the camera is V ₁ , and the equation direction vector of the light O ₂ P ₂ emitted by the laser fitted in the camera coordinate system is expressed as V ₂ , then its common perpendicular The direction vector of is V=V ₁ ×V ₂ , and the intersection points of the common vertical line and O ₁ P ₁ , O ₂ P ₂ are respectively M ₁ (x ₁ ,y ₁ ,z ₁ ) and M ₂ (x ₂ ,y ₂ , z ₂ ), then the spatial three-dimensional point coordinates of the laser spot are M(x, y, z), where x=(x ₁ +x ₂ )/2, y=(y ₁ +y ₂ )/2, z =(z ₁ +z ₂ )/2.

本文进一步提出利用反投影迭代法来优化求得的空间三维点坐标，具体方法如下：首先根据异面直线公垂线中点算法求得光斑质心的图像坐标[u,v]对应的空间点三维坐标[x,y,z]后，将此空间点反投影到图像，即求取空间三维点与相机光心连线与图像坐标的交点[u’,v’]，这样可以得到取光斑质心与反投影后图像坐标的中点[(u+u’)/2,(v+v’)/2],并将此作为光斑质心，根据异面直线公垂线中点算法求取空间三维点坐标，不断重复上述过程，计算空间点三维坐标M(x^*,y^*,z^*)与激光器投出直线l(空间直线方程为Ax+By+Cz+D＝0)的距离δ，计算公式为若δ小于某个阈值，则至此迭代结束。This paper further proposes to use the back-projection iterative method to optimize the obtained spatial three-dimensional point coordinates. The specific method is as follows: firstly, the three-dimensional spatial point corresponding to the image coordinates [u, v] of the center of mass of the spot is obtained according to the midpoint algorithm of the common vertical line of the straight line of different planes. After the coordinates [x, y, z], back-project this space point to the image, that is, find the intersection point [u', v'] of the line connecting the three-dimensional point in space and the optical center of the camera and the image coordinates, so that the center of mass of the spot can be obtained With the midpoint of the image coordinates after back projection [(u+u')/2, (v+v')/2], and use this as the spot centroid, calculate the three-dimensional space according to the midpoint algorithm of the common vertical line of different planes Point coordinates, repeat the above process continuously, calculate the distance δ between the three-dimensional coordinates M (x ^* , y ^* , z ^* ) of the space point and the straight line l projected by the laser (the equation of the straight line in space is Ax+By+Cz+D=0), and calculate The formula is If δ is less than a certain threshold, the iteration ends so far.

(4)实现欧氏三维重建(4) Realize Euclidean 3D reconstruction

致密重建方法包括基于体素的方法，基于多边形网格变形的方法，基于多视深度图的方法，以及基于面片扩展的方法。其中前三种方法仅适用于单一三维实体的重构，因而不能满足机器人导航、虚拟现实等技术需求。面片扩展法精度较高，且能对建筑物等大场景进行重建。但面片之间的连接性，即重建的完整性难以保证。Dense reconstruction methods include voxel-based methods, polygon mesh deformation-based methods, multi-view depth map-based methods, and patch expansion-based methods. Among them, the first three methods are only suitable for the reconstruction of a single 3D entity, so they cannot meet the technical requirements of robot navigation and virtual reality. The patch expansion method has high precision and can reconstruct large scenes such as buildings. However, the connectivity between patches, that is, the integrity of the reconstruction is difficult to guarantee.

针对上述问题，本文采用基于场景流反馈驱动的单目视觉大场景三维重建方法，具体步骤如下：手持自由移动式相机采集目标场景的多视图像，由相邻帧间光流场建立贯穿多视的像素匹配关系，继而利用五点算法求解多视间的欧氏空间转换关系。选取中央视图为参考帧，建立世界坐标系(O_w-X_wY_wZ_w)，求解相应像点对应的稀疏空间三维坐标。在稀疏重构的基础上生成原始网格面片，并反馈至各比较帧视角，由光流场定量评价反馈误差，并用各视图像的偏差驱动模型变形。由于光流矢量场蕴含了空间物体的运动矢量场信息，因而借助光流-场景流分析的方法，能够有效修正原始的多边形网格，在得到经过光流-场景流调整后的场景后，利用以上步骤得到的空间尺度因子，即可得到三维场景的欧氏重建。Aiming at the above problems, this paper adopts a 3D reconstruction method of large monocular vision scenes driven by scene flow feedback. The specific steps are as follows: a hand-held free-moving camera collects multi-view images of the target scene, and the optical flow field between adjacent frames is used to establish a multi-view reconstruction method. The pixel matching relationship, and then use the five-point algorithm to solve the Euclidean space transformation relationship between multiple views. Select the central view as the reference frame, establish the world coordinate system (O _w -X _w Y _w Z _w ), and solve the sparse space three-dimensional coordinates corresponding to the corresponding image points. On the basis of sparse reconstruction, the original mesh patch is generated and fed back to the viewing angles of each comparison frame, the feedback error is quantitatively evaluated by the optical flow field, and the deviation of each viewing image is used to drive the model deformation. Since the optical flow vector field contains the motion vector field information of spatial objects, the original polygonal mesh can be effectively corrected by means of the optical flow-scene flow analysis method. After obtaining the scene adjusted by the optical flow-scene flow, use The spatial scale factor obtained in the above steps can obtain the Euclidean reconstruction of the 3D scene.

以上所述，仅为本发明的较佳实施例而已，并非用于限定本发明的保护范围，凡在本发明的精神和原则之内所作的任何修改、等同替换和改进等，均应包含在本发明的保护范围之内。The above description is only a preferred embodiment of the present invention, and is not used to limit the protection scope of the present invention. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present invention shall be included in the within the protection scope of the present invention.

Claims

1. The method for determining the scale factor in monocular vision reconstruction based on point structured light distance measurement, characterized in that: in the process of calculating the scale factor; the entire calibration is unified on the camera coordinate system; the internal parameters of the camera use a camera based on a 2D plane target The calibration method is used for calibration. When calibrating the parameters of the structured light system, the spatial positions of the camera and the laser are kept unchanged; Take the image at this time, repeat this step to place the target any number of times to obtain a set of data images, and obtain the spot information corresponding to the target at each position according to the background difference method; use the corner point information on the target to obtain Putting the external parameter matrix of the target plane, plus the coordinate information of the center of mass of the laser spot in each calibration image, the spatial coordinates of the center of mass of the laser spot are obtained each time the target is placed; The spatial equation of the laser light can be obtained by performing Levenberg-Marquardt fitting on the results of the coordinates; the spatial coordinates of the laser spot on the image plane can be obtained through the calibrated sensor internal parameter information; theoretically, the laser light and the camera optical center and the laser spot are in the The coordinate lines of the image points on the image will completely coincide. However, due to calibration and calculation errors, these two lines will form straight lines with different planes. Set the midpoint of the common vertical line of the two straight lines with different planes as the real space coordinates of the laser spot. , the real scale factor is the ratio of the real space coordinates of the spot obtained by using the point structure cursor to the space coordinates of the spot obtained by using the three-dimensional reconstruction algorithm, so as to achieve the effect of Euclidean reconstruction;

This method specifically includes the following steps:

(1) Center positioning of spot light spot

Preprocess the image through various filtering or threshold selection methods, and then locate the centroid of the image spot; after processing the entire image pixel, according to the accumulated value of the final centroid parameter group of each spot, calculate according to the following first-order moment centroid The formula calculates the row and column coordinates of its centroid:

In the above formula, I(x, y) represents the gray value of the input image pixel, x, y are the row and column coordinates corresponding to the pixel, and X _c and Y _c are the row and column coordinates of the spot centroid respectively;

(2) Laser ray equation fitting with RANSAC elimination mechanism

RANSAC achieves the goal by repeatedly selecting a set of random subsets in the data; the selected subset is assumed to be an in-house point, and is verified by the following method. The model is described as follows:

1) Select n points from all spot centroids I as interior points, and use n points according to the least squares method to estimate the space linear equation L: Ax+By+Cz+D=0, that is, all unknown parameters of the equation (A ,B,C,D) can be calculated from the hypothetical interior points;

2) Use the model obtained in 1) to test all other remaining spot centroids, the number of spot centroids is I-n, if a certain spot centroid point is suitable for the estimated space linear model L, that is, if a certain spot centroid coordinates ( x', y', z') satisfy Ax'+By'+Cz'+D<|σ|, it is considered that this centroid point is also an internal point, where σ is the set threshold;

3) If there are enough spot centroid points classified as hypothetical interior points, that is, there are enough spot centroid points, assuming there are m points, satisfying Ax'+By'+Cz'+D<|σ|, Then the estimated model L is reasonable enough;

4) Then, use m internal points satisfying the condition to re-estimate the model, because it has only been estimated by the initial hypothetical internal point;

5) Finally, evaluate the model by estimating the error rate of the inlier points and the model;

This process is repeated for a fixed number of t times, and the model generated each time is either discarded because there are too few internal points, or selected because it is better than the existing model;

(3) Calculate the coordinates of the three-dimensional space point of the spot

However, the intersection point of the ray from the optical center of the camera passing through the center of mass of the spot and the ray from the laser cannot completely intersect at one point due to errors. The midpoint of the line segment is used as the spatial coordinates of the laser bright spot, so that the three-dimensional coordinates of the spatial point corresponding to the image coordinates of the spot centroid in the image can be obtained; the specific mathematical model is as follows: the ray O ₁ P ₁ direction vector sent by the optical center of the camera is V ₁ , the equation direction vector of the light O ₂ P ₂ emitted by the laser fitted in the camera coordinate system is expressed as V ₂ , then the direction vector of the common perpendicular is V=V ₁ ×V ₂ , and the common perpendicular is equal to O ₁ P ₁ , O ₂ P ₂ are respectively M ₁ (x ₁ ,y ₁ ,z ₁ ) and M ₂ (x ₂ ,y ₂ ,z ₂ ), then the spatial three-dimensional point coordinates of the laser spot are M(x,y, z), where x=(x ₁ +x ₂ )/2, y=(y ₁ +y ₂ )/2, z=(z ₁ +z ₂ )/2;

Use the back-projection iterative method to optimize the obtained spatial three-dimensional point coordinates. The specific method is as follows: firstly, the three-dimensional coordinates of the spatial point [x ,y,z], back-project this space point to the image, that is, find the intersection point [u',v'] of the three-dimensional point in space, the line connecting the optical center of the camera and the image coordinates, so that the centroid of the spot and the back-projection can be obtained The midpoint of the final image coordinates [(u+u')/2, (v+v')/2], and this is used as the spot centroid, and the three-dimensional point coordinates of the space are calculated according to the midpoint algorithm of the straight line and the vertical line of different planes. Repeat the above process continuously to calculate the distance δ between the three-dimensional coordinates M(x ^* , y ^* , z ^* ) of the spatial point and the straight line l projected by the laser. The spatial straight line equation is Ax+By+Cz+D=0, and the calculation formula is If δ is less than a certain threshold, the iteration ends so far;

(4) Realize Euclidean 3D reconstruction

Using the 3D reconstruction method of large monocular vision scenes driven by scene flow feedback, the specific steps are as follows: a hand-held free-moving camera collects multi-view images of the target scene, and the pixel matching relationship throughout the multi-view is established by the optical flow field between adjacent frames. Then use the five-point algorithm to solve the Euclidean space transformation relationship between multiple views; select the central view as the reference frame, establish the world coordinate system O _w -X _w Y _w Z _w , and solve the sparse space three-dimensional coordinates corresponding to the corresponding image points; On the basis of reconstruction, the original grid surface is generated and fed back to the viewing angles of each comparison frame, the feedback error is quantitatively evaluated by the optical flow field, and the deviation of each viewing image is used to drive the model deformation; since the optical flow vector field contains the movement of space objects Vector field information, so the original polygonal grid can be effectively corrected by means of the optical flow-scene flow analysis method. After obtaining the scene adjusted by the optical flow-scene flow, the spatial scale factor obtained by the above steps can be obtained. Euclidean reconstruction of 3D scenes.