CN110853075A - A visual tracking and localization method based on dense point cloud and synthetic view - Google Patents
A visual tracking and localization method based on dense point cloud and synthetic view
- Publication number
- CN110853075A (application number CN201911070899.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- point
- frame image
- algorithm
- current frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G01S17/66—Tracking systems using electromagnetic waves other than radio waves
- G01S17/89—Lidar systems specially adapted for specific applications for mapping or imaging
- G06T7/269—Analysis of motion using gradient-based methods
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T2207/10016—Video; Image sequence
- G06T2207/10024—Color image
- G06T2207/10028—Range image; Depth image; 3D point clouds
Abstract
The invention provides a visual tracking and positioning method based on dense point clouds and synthetic views. The process is as follows: a real scene is scanned in three dimensions to obtain color key-frame images and corresponding depth images; after image inpainting, the key-frame images are encoded. The current frame captured by the camera in real time is encoded in the same way, and the synthetic image whose code is closest to that of the current frame is selected as the reference frame of the current image. A stable set of matching feature points between the two images is obtained and processed to yield the six-degree-of-freedom pose of the current camera relative to the coordinate system of the 3D scanned point cloud. An optical flow algorithm then checks the result; if the requirement is not met, the current frame is updated to the next frame captured by the camera and the matching is repeated. The invention solves the problem of associating the 3D point cloud acquired by a lidar with visual images from a different sensor, and enables fast initialization of the positioning for visual navigation.
Description
Technical field
The invention belongs to the field of information technology, and in particular relates to a visual tracking and positioning method based on dense point clouds and synthetic views.
Background art
Real-time visual tracking and positioning in large-scale outdoor scenes has long been an important research direction in computer vision. Complex and changeable outdoor conditions such as illumination, viewpoint, occlusion, weak texture, and objects that move over time strongly affect the accuracy and robustness of visual tracking and positioning algorithms. Outdoor combat scenes such as deserts, grasslands, and mountains cover wide areas with little texture information, which raises the technical challenge further. The most commonly used algorithms for visual tracking and positioning in large outdoor scenes are SLAM (simultaneous localization and mapping) algorithms, which fall into two classes: feature-based SLAM and direct-method SLAM. Feature-based SLAM produces a sparse map of the scene together with the real-time camera trajectory, while direct-method SLAM produces a dense map and the real-time camera trajectory. Each approach has its own advantages and disadvantages and suits different applications. The limitation of current SLAM algorithms is that the scene map and the camera pose must be estimated simultaneously, so the computational cost is high, the pose estimate drifts as errors accumulate, dynamic scenes are hard to handle, and large-scale scenes cannot be supported.
To address the heavy computation and the accumulated drift of pose estimation in SLAM, the map can be built offline and tracking and positioning performed online. The offline map can be obtained with a vision-based SFM (structure from motion) algorithm or by 3D lidar scanning. Because the scene map recovered by SFM suffers from scale ambiguity, it is difficult to apply in outdoor augmented reality military simulation training. In contrast, 3D point clouds of outdoor scenes acquired by lidar offer high scanning accuracy and a known scale. Using lidar to obtain a 3D map of the scene for subsequent vision-based tracking and positioning has therefore become an important research direction. However, the 3D map built by the lidar and the images captured by the camera are heterogeneous data, affected by illumination changes, sensor differences, viewpoint differences, and other unfavorable factors, so the association between the lidar point cloud and the visual images is the key problem to be solved.
Summary of the invention
The purpose of the present invention is to overcome the defects of the prior art and to solve the problem of associating lidar 3D scan data with image data so that the scan data can be used for visual tracking and positioning. To this end, a visual tracking and positioning method based on dense point clouds and synthetic views is proposed.
The method of the present invention is realized through the following technical solution:
A visual tracking and positioning method based on dense point clouds and synthetic views; the specific process is as follows:
Step 1: Scan the real scene in three dimensions with a 3D lidar scanner placed at a fixed position to obtain a 3D point cloud model of the scene and the color images corresponding to the point cloud model;
Step 2: From the 3D point cloud model carrying color information, synthesize color key-frame images and corresponding depth images at several specific viewpoints with a projection transformation algorithm;
Step 3: Using the depth images, repair the holes in the synthesized color key-frame images with an image inpainting algorithm;
Step 4: Extract feature points from all repaired color key-frame images, perform offline training to build a codebook, and encode the key-frame images;
Step 5: Encode the current frame captured by the camera in real time, and select the synthetic image whose code is closest to that of the current frame as the reference frame of the current image;
Step 6: Extract feature points from the current frame and from the reference frame selected in step 5, and obtain a stable set of matching feature points with a feature matching algorithm;
Step 7: Using the projection correspondence between the feature points on the reference frame and the 3D scan points, construct the 2D/3D correspondence between the matched feature points of the current frame and the 3D scan points;
Step 8: From this 2D/3D correspondence, obtain the six-degree-of-freedom pose of the current camera relative to the coordinate system of the 3D scanned point cloud;
Step 9: Use an optical flow algorithm to predict the positions, in the next frame, of the feature points matched in the current frame;
Step 10: If the number of points tracked by optical flow is greater than a set threshold, use the six-degree-of-freedom pose computed in step 8 together with a visual SLAM algorithm for subsequent pose estimation; if the number of tracked points is smaller than the threshold, update the current frame to the next frame captured by the camera and return to step 5.
Beneficial effects
Compared with the prior art, the method of the present invention projects the 3D point cloud acquired by the lidar into synthetic images and performs image inpainting, retrieval, and matching on them. It therefore solves the problem of associating the lidar point cloud with heterogeneous visual images and achieves fast initialization of the positioning for visual navigation.
Brief description of the drawings
Figure 1 is a flowchart of the algorithm according to an embodiment of the invention;
Figure 2 is the 3D model of a real scene scanned with the Faro scanner;
Figure 3 shows a synthesized key-frame image and the corresponding depth map; the key frame is on the left and the depth map on the right;
Figure 4 shows the key-frame image and the corresponding depth map after inpainting; the key frame is on the left and the depth map on the right;
Figure 5 shows the feature matches between the current frame and the reference frame; the current frame is on the left and the reference frame on the right.
Detailed description of the embodiments
To make the purpose, technical solution, and advantages of the embodiments of the present invention clearer, the technical solution in the embodiments is described below clearly and completely with reference to the accompanying drawings.
The design idea of the invention is as follows: the 3D point cloud obtained by the lidar is back-projected to generate synthetic images at known viewpoints, and the real-time six-degree-of-freedom camera pose is estimated by matching the synthetic images against the real-time images captured by the camera. The method can be used for tracking, navigation, or assisted positioning of robots, unmanned ground vehicles, unmanned aerial vehicles, and virtual and augmented reality systems.
The visual tracking and positioning method based on dense point clouds and synthetic views of the present invention, shown in Figure 1, comprises the following steps:
Step 1. Place the Faro lidar scanner at a fixed outdoor location, scan the surrounding real scene, and obtain a 3D point cloud model and the RGB images corresponding to the model. Each 3D point (x_w, y_w, z_w, r, g, b) of the point cloud model contains its coordinates (x_w, y_w, z_w)^T in the point cloud coordinate system and its RGB color information (r, g, b).
Step 2. From the 3D point cloud model with color information, synthesize color key-frame images and corresponding depth images at specific viewpoints with a projection transformation algorithm. The specific steps are:
(1) From the 3D point cloud model obtained in step 1, generate a triangular mesh model with a triangulation algorithm and import the model into 3D modeling software. In the software, adjust the pose of a virtual camera to obtain the rotation matrix $R_k$ and translation vector $t_k$ of each of the k viewpoints. For every 3D point (x_w, y_w, z_w, r, g, b) in the point cloud, compute its coordinates in the coordinate system of each virtual camera with formula (1):
$(x_{c_k}, y_{c_k}, z_{c_k})^T = R_k \, (x_w, y_w, z_w)^T + t_k$   (1)
(2) Set the intrinsic parameters of the virtual camera to the values of the real camera, including the focal lengths (f_x, f_y) and the principal point (u_0, v_0), and transform the coordinates (x_{c_k}, y_{c_k}, z_{c_k})^T from the virtual camera coordinate system to the image coordinate system with formula (2):
$u_k = f_x \, x_{c_k} / z_{c_k} + u_0, \qquad v_k = f_y \, y_{c_k} / z_{c_k} + v_0$   (2)
(3) Clip the image coordinates (u_k, v_k) obtained from formula (2) against the resolution W×H of the real camera: if the point falls outside the image, i.e. u_k < 0 or u_k ≥ W or v_k < 0 or v_k ≥ H, discard it; otherwise keep it.
(4) When several 3D points (x_w, y_w, z_w, r, g, b) project to the same image coordinates (u_k, v_k), compare their z_{c_k} components in the virtual camera coordinate system and keep the 3D point with the smallest positive value, which removes the front-to-back occlusion problem of the point cloud.
(5) Assign to each projected image coordinate (u_k, v_k) the color information (r, g, b) of the corresponding 3D point (x_w, y_w, z_w, r, g, b) and the z_{c_k} component of its coordinates in the virtual camera coordinate system, thereby synthesizing the color key-frame image and the corresponding depth map for that viewpoint.
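The following Python sketch illustrates sub-steps (1) to (5) for a single viewpoint, assuming the colored point cloud is already available as NumPy arrays; the variable names (points_w, colors, R_k, t_k, fx, fy, u0, v0, W, H) are illustrative and not taken from the patent. A z-buffer keeps, for each pixel, the point with the smallest positive depth.

```python
import numpy as np

def synthesize_view(points_w, colors, R_k, t_k, fx, fy, u0, v0, W, H):
    """Project a colored point cloud into one virtual view (illustrative sketch).

    points_w: (N, 3) point coordinates in the point cloud coordinate system
    colors:   (N, 3) RGB values of the points
    R_k, t_k: rotation (3x3) and translation (3,) of virtual camera k
    Returns a color image and a depth map; pixels hit by no point remain holes.
    """
    # Formula (1): transform points into the virtual camera coordinate system
    points_c = points_w @ R_k.T + t_k
    x_c, y_c, z_c = points_c[:, 0], points_c[:, 1], points_c[:, 2]

    # Keep only points in front of the camera
    front = z_c > 0
    # Formula (2): pinhole projection to image coordinates
    u = np.round(fx * x_c[front] / z_c[front] + u0).astype(int)
    v = np.round(fy * y_c[front] / z_c[front] + v0).astype(int)
    z, col = z_c[front], colors[front]

    # Sub-step (3): clip against the real camera resolution W x H
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    u, v, z, col = u[inside], v[inside], z[inside], col[inside]

    # Sub-steps (4)-(5): z-buffer keeps the smallest positive depth per pixel
    color_img = np.zeros((H, W, 3), dtype=np.uint8)
    depth_img = np.zeros((H, W), dtype=np.float32)   # 0 marks a hole
    for ui, vi, zi, ci in zip(u, v, z, col):
        if depth_img[vi, ui] == 0 or zi < depth_img[vi, ui]:
            depth_img[vi, ui] = zi
            color_img[vi, ui] = ci
    return color_img, depth_img
```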
Step 3. Using the color key-frame images and the corresponding depth maps obtained in step 2, apply a depth-guided image inpainting algorithm to each color key-frame image to remove the holes in it. The specific steps are:
(1) Complete the depth map with a linear interpolation algorithm to remove the holes in the depth image.
(2) Repair the color key-frame image using the completed depth map:
Read every pixel (u, v) of the color key-frame image and check whether it is a blank point. If it is, search for reference points (i, j) in the eight directions around the blank point; a reference point must not itself be blank. Read the depths of (u, v) and of each reference point (i, j) from the depth map, and fill the blank point with the weighted sum of the reference points according to formula (3):
$I(u,v) = \dfrac{\sum_{(i,j)\in R} W(i,j)\,D(i,j)\,I(i,j)}{\sum_{(i,j)\in R} W(i,j)\,D(i,j)}$   (3)
Here (u, v) is the blank point, (i, j) is a reference point, R is the set of reference points, I(i, j) is the color information of point (i, j), I(u, v) is the color information needed to fill the blank point (u, v), and D(i, j) is the depth weighting factor: D(i, j) = 1 when the difference between the depth of the reference point (i, j) and the depth of the point (u, v) is smaller than a set threshold, and D(i, j) = 0 otherwise. W(i, j) is the distance weighting factor of point (i, j), computed by formula (4).
(3) Apply step (2) above to every point in the key-frame image; once the whole image has been processed, the inpainting is complete.
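A minimal sketch of this depth-guided hole filling is given below. Since the patent does not reproduce formula (4), the inverse-distance weight used here for W(i, j), the fixed set of eight search directions, and the search depth are assumptions made for illustration; the depth-consistency test D(i, j) follows the description above.

```python
import numpy as np

# Eight search directions around a blank pixel (assumed interpretation of the patent text)
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def fill_holes(color_img, depth_filled, blank_mask, depth_thresh=0.1, max_steps=20):
    """Fill blank pixels of a synthetic key frame with depth-consistent neighbors.

    color_img:    (H, W, 3) synthetic color image containing holes
    depth_filled: (H, W) depth map already completed by linear interpolation
    blank_mask:   (H, W) boolean mask of the blank pixels to be filled
    """
    H, W = depth_filled.shape
    out = color_img.astype(np.float32).copy()
    for v in range(H):
        for u in range(W):
            if not blank_mask[v, u]:
                continue
            num, den = np.zeros(3), 0.0
            for dv, du in DIRECTIONS:
                # Walk along each direction until a usable reference point is found
                for s in range(1, max_steps + 1):
                    i, j = v + dv * s, u + du * s
                    if not (0 <= i < H and 0 <= j < W):
                        break
                    if blank_mask[i, j]:
                        continue
                    # D(i, j): keep only depth-consistent reference points
                    D = 1.0 if abs(depth_filled[i, j] - depth_filled[v, u]) < depth_thresh else 0.0
                    W_ij = 1.0 / np.hypot(i - v, j - u)   # assumed inverse-distance weight
                    num += W_ij * D * out[i, j]
                    den += W_ij * D
                    break
            if den > 0:
                out[v, u] = num / den   # formula (3): normalized weighted sum
    return out.astype(np.uint8)
```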
Step 4. Extract feature points from all repaired color key-frame images, perform offline training to build a codebook, and encode the color key-frame images with the VLAD (Vector of Locally Aggregated Descriptors) algorithm. The specific steps are:
(1) Extract the ORB feature points of all repaired color key-frame images to build the set of feature descriptors {x_j}, and cluster the ORB descriptor set with the k-means algorithm to obtain a codebook with M cluster centers C = {c_1, c_2, ..., c_M}.
(2) Compare each feature descriptor x_j extracted from a color key-frame image with the M cluster centers of the codebook using the k-nearest-neighbor algorithm, and assign it to its nearest cluster center c_i = NN(x_j).
(3) Compute the residual x_j − c_i between each feature descriptor x_j and its nearest cluster center c_i, and for each color key-frame image accumulate the residuals assigned to the same cluster center into the VLAD vector $v_i = \sum_{x_j:\,NN(x_j)=c_i} (x_j - c_i)$.
(4) Concatenate the accumulated residual vectors of all cluster centers to obtain the VLAD descriptor of the color key-frame image.
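The codebook construction and VLAD encoding of step 4 can be sketched as follows. The use of scikit-learn's KMeans, the codebook size M, and the treatment of the binary ORB descriptors as float vectors are implementation choices for illustration, not requirements of the patent; the final L2 normalization is common practice and likewise assumed.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

orb = cv2.ORB_create(nfeatures=1000)

def extract_descriptors(image):
    """ORB descriptors of one image as a float array (possibly empty)."""
    _, desc = orb.detectAndCompute(image, None)
    return np.empty((0, 32), np.float32) if desc is None else desc.astype(np.float32)

def build_codebook(keyframe_images, M=64):
    """Offline training: cluster all key-frame descriptors into M centers."""
    all_desc = np.vstack([extract_descriptors(img) for img in keyframe_images])
    return KMeans(n_clusters=M, n_init=4, random_state=0).fit(all_desc)

def vlad_encode(image, codebook):
    """Accumulate residuals to the nearest cluster centers and concatenate them."""
    desc = extract_descriptors(image)
    centers = codebook.cluster_centers_
    vlad = np.zeros_like(centers, dtype=np.float32)
    if len(desc):
        nearest = codebook.predict(desc)          # nearest-neighbor assignment
        for x_j, c_i in zip(desc, nearest):
            vlad[c_i] += x_j - centers[c_i]       # residual accumulation
    vlad = vlad.flatten()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad      # assumed L2 normalization
```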
Step 5. Encode the current frame captured by the camera in real time with the VLAD algorithm of step 4 to obtain the VLAD image descriptor of the current frame, and use the KNN (k-nearest-neighbor) algorithm to select the color key-frame image whose encoding distance to the current frame is smallest as the reference frame of the current image.
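With the VLAD codes of all key frames stored, reference-frame selection reduces to a nearest-neighbor search; the sketch below simply uses the Euclidean distance between VLAD descriptors, building on the hypothetical vlad_encode above.

```python
import numpy as np

def select_reference_frame(current_vlad, keyframe_vlads):
    """Index of the key frame whose VLAD code is closest to the current frame."""
    dists = np.linalg.norm(keyframe_vlads - current_vlad, axis=1)
    return int(np.argmin(dists))
```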
Step 6. Extract the ORB feature points of the current frame and of the reference frame selected in step 5, and obtain a stable set of matching feature points with a feature matching algorithm. The specific steps are:
(1) Extract the ORB features of the current frame and of the reference frame, and determine the initial set of matching feature points from the ratio of the nearest-neighbor distance to the second-nearest-neighbor distance of the feature descriptors.
(2) Compute the fundamental matrix F of the matching feature point set with the 8-point algorithm;
(3) Build the back-projection error of the matching points from the computed fundamental matrix F, and remove the remaining outliers with the RANSAC algorithm.
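A sketch of step 6 using the usual OpenCV workflow is shown below: a ratio test on Hamming-distance matches, followed by outlier rejection against a fundamental matrix. Note that cv2.findFundamentalMat with the FM_RANSAC flag combines the 8-point estimation and RANSAC in one call, so it stands in here for the two separate sub-steps.

```python
import cv2
import numpy as np

def match_frames(current_img, reference_img, ratio=0.75):
    """Geometrically consistent ORB matches between the current and reference frames."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(current_img, None)
    kp2, des2 = orb.detectAndCompute(reference_img, None)

    # Ratio test between nearest and second-nearest neighbor (Hamming distance for ORB)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    # Fundamental matrix + RANSAC rejects matches inconsistent with epipolar geometry
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)
    inliers = mask.ravel().astype(bool)
    return pts1[inliers], pts2[inliers]
```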
Step 7. Using the projection correspondence between the feature points on the reference frame and the 3D scan points, construct the 2D/3D correspondence between the matched feature points of the current frame and the 3D scan points.
Step 8. From the correspondence between the current-frame image coordinates and the 3D point coordinates established in step 7, solve for the six-degree-of-freedom pose of the current camera relative to the 3D point cloud coordinate system with the PnP (perspective-n-point) algorithm in OpenCV.
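Step 8 maps directly onto OpenCV's PnP solvers. The sketch below assumes the 2D/3D correspondences from step 7 are already available as arrays and uses solvePnPRansac for extra robustness; the patent itself only requires a PnP solution.

```python
import cv2
import numpy as np

def estimate_pose(points_3d, points_2d, K, dist_coeffs=None):
    """Six-degree-of-freedom camera pose relative to the point cloud coordinate system.

    points_3d: (N, 3) scanned 3D points matched to the current frame
    points_2d: (N, 2) corresponding pixel coordinates in the current frame
    K:         3x3 intrinsic matrix of the real camera
    Returns (R, t) or None when the solver fails.
    """
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float32), points_2d.astype(np.float32), K, dist_coeffs)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)    # rotation vector to rotation matrix
    return R, tvec
```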
Step 9. Use the optical flow tracking algorithm in OpenCV to estimate the image coordinates, in the next frame, of the feature points of the current frame.
Step 10. From the optical flow result of step 9, count the number of feature points that can be tracked stably and compare it with the set threshold. If the number of tracked feature points is greater than the threshold, use the six-degree-of-freedom pose computed in step 8 together with a visual SLAM algorithm for subsequent pose estimation; if the number of tracked feature points is smaller than the threshold, update the current frame to the next frame captured by the camera and return to step 5 to select a new reference frame.
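Steps 9 and 10 can be sketched with OpenCV's pyramidal Lucas-Kanade tracker; the threshold value below is illustrative and would be tuned for the application.

```python
import cv2
import numpy as np

def track_and_check(current_gray, next_gray, current_pts, min_tracked=30):
    """Track matched points into the next frame and test whether tracking is stable.

    current_pts: (N, 1, 2) float32 pixel coordinates of the matched points
    Returns (ok, tracked_pts): when ok is True, the pose from step 8 is handed to the
    visual SLAM back end; otherwise the caller fetches the next camera frame and
    returns to step 5 to select a new reference frame.
    """
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(current_gray, next_gray, current_pts, None)
    tracked = status.ravel() == 1
    return int(tracked.sum()) > min_tracked, next_pts[tracked]
```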
This completes the visual tracking and positioning based on dense point clouds and synthetic views.
As shown in Figures 2 to 5: Figure 2 is the 3D model of the real scene scanned by the Faro scanner. Figure 3 shows a key-frame image and the corresponding depth image synthesized from the 3D model of the real scene, with the key-frame image on the left and the depth image on the right. Figure 4 shows the result of the inpainting of step 3, with the repaired key-frame image on the left and the repaired depth image on the right. Figure 5 shows the feature matches between the current frame and the reference frame, with the current frame on the left and the reference frame on the right.
To sum up, the above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911070899.3A CN110853075B (en) | 2019-11-05 | 2019-11-05 | A visual tracking and localization method based on dense point cloud and synthetic view |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911070899.3A CN110853075B (en) | 2019-11-05 | 2019-11-05 | A visual tracking and localization method based on dense point cloud and synthetic view |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110853075A true CN110853075A (en) | 2020-02-28 |
CN110853075B CN110853075B (en) | 2021-08-06 |
Family
ID=69599685
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911070899.3A Expired - Fee Related CN110853075B (en) | 2019-11-05 | 2019-11-05 | A visual tracking and localization method based on dense point cloud and synthetic view |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110853075B (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111337947A (en) * | 2020-05-18 | 2020-06-26 | 深圳市智绘科技有限公司 | Instant mapping and positioning method, device, system and storage medium |
CN111402414A (en) * | 2020-03-10 | 2020-07-10 | 北京京东叁佰陆拾度电子商务有限公司 | Point cloud map construction method, device, equipment and storage medium |
CN111462179A (en) * | 2020-03-26 | 2020-07-28 | 北京百度网讯科技有限公司 | Three-dimensional object tracking method, device and electronic device |
CN112001966A (en) * | 2020-08-03 | 2020-11-27 | 南京理工大学 | Method for positioning and tracking display screen in flight training AR system |
CN112067233A (en) * | 2020-09-02 | 2020-12-11 | 中国航天空气动力技术研究院 | Six-degree-of-freedom motion capture method for wind tunnel model |
CN112132900A (en) * | 2020-09-29 | 2020-12-25 | 凌美芯(北京)科技有限责任公司 | Visual repositioning method and system |
CN112669436A (en) * | 2020-12-25 | 2021-04-16 | 嘉兴恒创电力集团有限公司博创物资分公司 | Deep learning sample generation method based on 3D point cloud |
CN112837424A (en) * | 2021-02-04 | 2021-05-25 | 脸萌有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN113724365A (en) * | 2020-05-22 | 2021-11-30 | 杭州海康威视数字技术股份有限公司 | Three-dimensional reconstruction method and device |
CN113781536A (en) * | 2021-09-06 | 2021-12-10 | 广州极飞科技股份有限公司 | Image alignment method and apparatus, electronic device, and computer-readable storage medium |
CN113961068A (en) * | 2021-09-29 | 2022-01-21 | 北京理工大学 | Close-distance real object eye movement interaction method based on augmented reality helmet |
US11315271B2 (en) * | 2020-09-30 | 2022-04-26 | Tsinghua University | Point cloud intensity completion method and system based on semantic segmentation |
CN114887321A (en) * | 2022-04-26 | 2022-08-12 | 广州宸境科技有限公司 | Multi-person AR interaction method, device, equipment and storage medium |
CN115619837A (en) * | 2022-12-20 | 2023-01-17 | 中科航迈数控软件(深圳)有限公司 | AR image generation method and related equipment |
CN115661493A (en) * | 2022-12-28 | 2023-01-31 | 航天云机(北京)科技有限公司 | Object pose determination method and device, equipment and storage medium |
CN115880334A (en) * | 2022-12-05 | 2023-03-31 | 无锡东如科技有限公司 | Video object tracking method for automatic machine learning map fusion |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107025668A (en) * | 2017-03-30 | 2017-08-08 | 华南理工大学 | A kind of design method of the visual odometry based on depth camera |
CN110288657A (en) * | 2019-05-23 | 2019-09-27 | 华中师范大学 | A 3D Registration Method for Augmented Reality Based on Kinect |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107025668A (en) * | 2017-03-30 | 2017-08-08 | 华南理工大学 | A kind of design method of the visual odometry based on depth camera |
CN110288657A (en) * | 2019-05-23 | 2019-09-27 | 华中师范大学 | A 3D Registration Method for Augmented Reality Based on Kinect |
Non-Patent Citations (2)
Title |
---|
QIAN-SHAN LI et al.: "Building a dense surface map incrementally from semi-dense point cloud and RGB images", Frontiers of Information Technology & Electronic Engineering *
KUANG Wenbin: "Double-layer filtering 3D reconstruction based on monocular camera sequence images", Computer and Digital Engineering *
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111402414A (en) * | 2020-03-10 | 2020-07-10 | 北京京东叁佰陆拾度电子商务有限公司 | Point cloud map construction method, device, equipment and storage medium |
CN111402414B (en) * | 2020-03-10 | 2024-05-24 | 北京京东叁佰陆拾度电子商务有限公司 | Point cloud map construction method, device, equipment and storage medium |
CN111462179A (en) * | 2020-03-26 | 2020-07-28 | 北京百度网讯科技有限公司 | Three-dimensional object tracking method, device and electronic device |
CN111462179B (en) * | 2020-03-26 | 2023-06-27 | 北京百度网讯科技有限公司 | Three-dimensional object tracking method and device and electronic equipment |
US12094226B2 (en) | 2020-05-18 | 2024-09-17 | Shenzhen Intelligence Ally Technology Co., Ltd. | Simultaneous localization and mapping method, device, system and storage medium |
CN111337947B (en) * | 2020-05-18 | 2020-09-22 | 深圳市智绘科技有限公司 | Instant mapping and positioning method, device, system and storage medium |
CN111337947A (en) * | 2020-05-18 | 2020-06-26 | 深圳市智绘科技有限公司 | Instant mapping and positioning method, device, system and storage medium |
CN113724365A (en) * | 2020-05-22 | 2021-11-30 | 杭州海康威视数字技术股份有限公司 | Three-dimensional reconstruction method and device |
CN113724365B (en) * | 2020-05-22 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | Three-dimensional reconstruction method and device |
CN112001966A (en) * | 2020-08-03 | 2020-11-27 | 南京理工大学 | Method for positioning and tracking display screen in flight training AR system |
CN112067233B (en) * | 2020-09-02 | 2022-08-12 | 中国航天空气动力技术研究院 | A 6-DOF Motion Capture Method for Wind Tunnel Models |
CN112067233A (en) * | 2020-09-02 | 2020-12-11 | 中国航天空气动力技术研究院 | Six-degree-of-freedom motion capture method for wind tunnel model |
CN112132900B (en) * | 2020-09-29 | 2024-06-25 | 凌美芯(北京)科技有限责任公司 | Visual repositioning method and system |
CN112132900A (en) * | 2020-09-29 | 2020-12-25 | 凌美芯(北京)科技有限责任公司 | Visual repositioning method and system |
US11315271B2 (en) * | 2020-09-30 | 2022-04-26 | Tsinghua University | Point cloud intensity completion method and system based on semantic segmentation |
CN112669436A (en) * | 2020-12-25 | 2021-04-16 | 嘉兴恒创电力集团有限公司博创物资分公司 | Deep learning sample generation method based on 3D point cloud |
CN112669436B (en) * | 2020-12-25 | 2024-10-18 | 嘉兴恒创电力集团有限公司博创物资分公司 | Deep learning sample generation method based on 3D point cloud |
CN112837424B (en) * | 2021-02-04 | 2024-02-06 | 脸萌有限公司 | Image processing method, apparatus, device and computer readable storage medium |
CN112837424A (en) * | 2021-02-04 | 2021-05-25 | 脸萌有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN113781536A (en) * | 2021-09-06 | 2021-12-10 | 广州极飞科技股份有限公司 | Image alignment method and apparatus, electronic device, and computer-readable storage medium |
CN113961068B (en) * | 2021-09-29 | 2023-01-06 | 北京理工大学 | Close-range real object eye movement interaction method based on augmented reality helmet |
CN113961068A (en) * | 2021-09-29 | 2022-01-21 | 北京理工大学 | Close-distance real object eye movement interaction method based on augmented reality helmet |
CN114887321A (en) * | 2022-04-26 | 2022-08-12 | 广州宸境科技有限公司 | Multi-person AR interaction method, device, equipment and storage medium |
CN115880334A (en) * | 2022-12-05 | 2023-03-31 | 无锡东如科技有限公司 | Video object tracking method for automatic machine learning map fusion |
CN115880334B (en) * | 2022-12-05 | 2023-07-28 | 无锡东如科技有限公司 | A Video Object Tracking Method Based on Automated Machine Learning Atlas Fusion |
CN115619837A (en) * | 2022-12-20 | 2023-01-17 | 中科航迈数控软件(深圳)有限公司 | AR image generation method and related equipment |
CN115661493A (en) * | 2022-12-28 | 2023-01-31 | 航天云机(北京)科技有限公司 | Object pose determination method and device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110853075B (en) | 2021-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110853075A (en) | A visual tracking and localization method based on dense point cloud and synthetic view | |
Zhou et al. | To learn or not to learn: Visual localization from essential matrices | |
CN108509848B (en) | The real-time detection method and system of three-dimension object | |
CN109166149B (en) | Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU | |
CN114782691A (en) | Robot target identification and motion detection method based on deep learning, storage medium and equipment | |
CN108776989B (en) | Low-texture planar scene reconstruction method based on sparse SLAM framework | |
CN108876814B (en) | A method for generating pose flow images | |
CN113658337B (en) | Multi-mode odometer method based on rut lines | |
Moreau et al. | Crossfire: Camera relocalization on self-supervised features from an implicit representation | |
CN110570474A (en) | Pose estimation method and system of depth camera | |
Wang et al. | Flow-motion and depth network for monocular stereo and beyond | |
CN111998862A (en) | Dense binocular SLAM method based on BNN | |
CN116662600A (en) | Visual positioning method based on lightweight structured line map | |
CN115272450A (en) | Target positioning method based on panoramic segmentation | |
CN116894876A (en) | 6-DOF positioning method based on real-time image | |
CN116843753A (en) | Robust 6D pose estimation method based on bidirectional matching and global attention network | |
Liu et al. | Dense stereo matching strategy for oblique images that considers the plane directions in urban areas | |
CN107240149A (en) | Object 3D Model Construction Method Based on Image Processing | |
CN119850720A (en) | Method for determining preset position of video monitoring camera | |
Wang et al. | Real-time omnidirectional visual SLAM with semi-dense mapping | |
Ng et al. | Ood-pose: Camera pose regression from out-of-distribution synthetic views | |
Bajramovic et al. | Global Uncertainty-based Selection of Relative Poses for Multi Camera Calibration. | |
Chen et al. | End-to-end multi-view structure-from-motion with hypercorrelation volume | |
Cui et al. | MMFusion: a generalized multi-modal fusion detection framework | |
Wang et al. | 360° Panorama Stitching Method with Depth Information: Enhancing Image Quality and Stitching Accuracy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210806 |