
CN114037743A - A Robust Registration Method for 3D Point Clouds of Terracotta Warriors Based on Dynamic Graph Attention Mechanism - Google Patents


Info

Publication number
CN114037743A
CN114037743A
Authority
CN
China
Prior art keywords
point cloud
registration
attention mechanism
dynamic graph
warriors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111245398.1A
Other languages
Chinese (zh)
Other versions
CN114037743B (en)
Inventor
张海波
海琳琦
鱼跃华
岳子璇
李倩红
李康
耿国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University filed Critical Northwest University
Priority to CN202111245398.1A priority Critical patent/CN114037743B/en
Publication of CN114037743A publication Critical patent/CN114037743A/en
Application granted granted Critical
Publication of CN114037743B publication Critical patent/CN114037743B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/344Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a robust registration method for 3D point clouds of the Terracotta Warriors based on a dynamic graph attention mechanism, comprising: step 1, acquiring 3D point clouds of the Terracotta Warriors at different resolutions with a 3D scanner; step 2, in a U-Net network, replacing the convolutional layers with B-NHN-Conv and the deconvolutional layers with B-NHN-ConvTr, and embedding a residual module and a dynamic graph attention mechanism into the U-Net network to obtain a point cloud registration network; step 3, feeding the multi-resolution 3D point clouds of the Terracotta Warriors into the point cloud registration network and training it under the supervision of a circle loss and an overlap loss; step 4, using the trained point cloud registration network to extract features of the 3D point clouds and estimating the transformation matrix between the source and target point clouds with the RANSAC algorithm to complete the registration. The registration method provided by the invention still learns robust features when the point cloud resolutions are mismatched and heavily contaminated with noise, and completes registration well under low overlap.

Figure 202111245398

Description

A Robust Registration Method for 3D Point Clouds of Terracotta Warriors Based on a Dynamic Graph Attention Mechanism

Technical Field

The invention relates to 3D point cloud model registration, and in particular to a robust registration method for 3D point clouds of the Qin Terracotta Warriors based on a dynamic graph attention mechanism.

Background Art

Point cloud registration plays an important role in projects such as the virtual restoration of the Terracotta Warriors and the Qin Mausoleum smart museum; accurate registration results are the key to subsequent 3D reconstruction. The goal of point cloud registration is to compute a coordinate transformation that unifies point cloud data captured from different viewpoints into a single specified coordinate system through rigid transformations such as rotation and translation.

At present, most registration methods for the Qin Terracotta Warriors and related cultural relics are traditional optimization-based methods, the classic example being the Iterative Closest Point (ICP) algorithm. ICP consists of two stages, correspondence search and transform estimation, which are repeated alternately to find the best transformation between the point clouds. However, when handling scenes with large initial pose differences, mismatched point cloud resolutions, strong noise, or small overlap, such methods easily fall into local optima. In recent years, researchers have proposed deep-learning-based methods that learn robust features to compute corresponding points and then determine the final transformation matrix with RANSAC or SVD, without iterating between correspondence estimation and transform estimation. However, these algorithms still cannot handle challenges such as partial overlap, density variation, and noise quickly and robustly.
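For illustration, the two-stage ICP loop described above can be sketched as follows. This is a generic toy implementation with brute-force nearest-neighbour search, not the method of the invention:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst
    (Kabsch algorithm via SVD)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=50):
    """Classic two-stage ICP: correspondence search, then transform
    estimation, repeated alternately."""
    cur = src.copy()
    for _ in range(iters):
        # stage 1: correspondence search (brute-force nearest neighbour)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        matches = dst[d2.argmin(axis=1)]
        # stage 2: transform estimation on the matched pairs
        R, t = best_rigid_transform(cur, matches)
        cur = cur @ R.T + t
    # the composition of all updates is the rigid fit of src onto cur
    return best_rigid_transform(src, cur)
```

For a well-initialized pose this converges quickly; as noted above, it falls into local optima when the initial pose difference, noise, or overlap conditions are unfavourable.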

Summary of the Invention

In view of the deficiencies of the prior art, the purpose of the present invention is to provide a robust registration method for 3D point clouds of the Terracotta Warriors based on a dynamic graph attention mechanism that handles partial overlap and density variation robustly.

To achieve the above object, the present invention adopts the following technical solution:

A robust registration method for 3D point clouds of the Terracotta Warriors based on a dynamic graph attention mechanism, comprising the following steps:

Step 1: acquire 3D point clouds of the Terracotta Warriors at different resolutions with a 3D scanner.

Step 2: in a U-Net network, replace the convolutional layers with B-NHN-Conv and the deconvolutional layers with B-NHN-ConvTr, and embed a residual module and a dynamic graph attention mechanism into the U-Net network to obtain the point cloud registration network.

Step 3: input the multi-resolution 3D point clouds of the Terracotta Warriors into the point cloud registration network and train it under the joint supervision of a circle loss and an overlap loss.

Step 4: use the trained point cloud registration network to extract features of the 3D point clouds of the Terracotta Warriors, and estimate the transformation matrix between the source and target point clouds with the RANSAC algorithm to complete the registration.

Further, the U-Net network in step 2 also comprises an encoder module and a decoder module. The encoder module obtains features at multiple scales through repeated downsampling and converts all fully connected layers of the multilayer perceptron into a series of fully convolutional layers with kernel size 1×1×1 and channel counts (64, 128, 256, 512). At each upsampling step, the decoder module fuses features of the same scale and channel count from the feature extraction part through skip connections, recovering the information lost by downsampling.

Further, the residual module in step 2 is formed by connecting several residual blocks in series and skip-connects the input to the output.

Further, the dynamic graph attention mechanism in step 2 is a combination of multilayer self-attention and cross-attention modules. Based on the attention weights a_ij, it finds the top k edges for each query node and constructs the graph using only those k edges and the corresponding nodes; at each layer of the mechanism, a new graph is constructed from the updated attention weights a_ij.

Compared with the prior art, the present invention has the following technical effects:

The present invention uses U-Net as the backbone architecture and embeds residual blocks to mitigate vanishing and exploding gradients. Embedding the dynamic graph attention mechanism aggregates local and contextual features of the point clouds, producing multi-level, semantically richer representations that also help the network identify possible overlapping regions between point clouds. The B-NHN-Conv convolution operation reduces or removes latent variation in the feature mean and standard deviation, improving the robustness of the 3D features to changes in point density. As a result, the point cloud registration network constructed by the invention can still learn robust features when the point cloud resolutions are mismatched and heavily contaminated with noise, and completes registration well under low overlap.

Further, converting all fully connected layers of the multilayer perceptron into a series of fully convolutional layers with kernel size 1×1×1 speeds up network processing.

Brief Description of the Drawings

Fig. 1 is a flowchart of the robust registration method for 3D point clouds of the Terracotta Warriors based on a dynamic graph attention mechanism according to the present invention;

Fig. 2 is a diagram of the point cloud registration network model;

Fig. 3 is a schematic diagram of the structure of the dynamic graph attention mechanism;

Fig. 4 is a schematic diagram of the structure of the residual module;

Fig. 5 shows the initial poses of the point clouds of two Terracotta Warrior heads;

Fig. 6 shows the registration result for the point clouds of the two heads;

Fig. 7 shows the initial poses of the point clouds of two Terracotta Warrior feet;

Fig. 8 shows the registration result for the point clouds of the two feet.

Detailed Description of the Embodiments

The specific content of the present invention is further explained in detail below with reference to an embodiment.

Referring to Figs. 1-4, this embodiment provides a robust registration method for 3D point clouds of the Terracotta Warriors based on a dynamic graph attention mechanism, comprising the following steps:

Step 1: acquire 3D point clouds of the Terracotta Warriors at different resolutions with a 3D scanner.

Step 2: in a U-Net network, replace the convolutional layers with B-NHN-Conv and the deconvolutional layers with B-NHN-ConvTr, and embed a residual module and a dynamic graph attention mechanism into the U-Net network to obtain the point cloud registration network, wherein:

B-NHN-Conv combines B-NHN normalization with a subsequent 3D sparse convolution, the two being tightly coupled; B-NHN-ConvTr is the transposed counterpart of B-NHN-Conv, i.e. B-NHN normalization combined with a 3D sparse transposed convolution. B-NHN normalization reduces or removes latent variation in the feature mean and standard deviation, improving the robustness of the 3D features to changes in point density.

The residual module is formed by connecting several residual blocks in series and skip-connects the input to the output; this alleviates the loss of useful feature information, makes the network easier to optimize, and helps establish correspondences between the two point clouds.

The dynamic graph attention mechanism, also called the dynamic graph attention module, is a combination of multilayer self-attention and cross-attention modules. Based on the attention weights a_ij, it finds the top k edges for each query node and constructs the graph using only those k edges and the corresponding nodes; at each layer of the mechanism, a new graph is constructed from the updated attention weights a_ij.
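The top-k edge selection that makes the graph dynamic can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation of the invention:

```python
import numpy as np

def topk_graph(attn, k):
    """Keep, for every query node (row), only the k edges with the largest
    attention weights a_ij; returns a boolean adjacency mask that can be
    used to rebuild the graph at the next attention layer."""
    n_q = attn.shape[0]
    keep = np.argsort(attn, axis=1)[:, -k:]   # column indices of top-k per row
    mask = np.zeros(attn.shape, dtype=bool)
    mask[np.arange(n_q)[:, None], keep] = True
    return mask
```

Because the attention weights change at every layer, calling `topk_graph` on each layer's fresh weights yields a new graph each time, which is what makes the graph dynamic.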

The dynamic graph attention mechanism first computes, by linear projection of the feature nodes in the graph, a query vector $q_i \in \mathbb{R}^b$, key vectors $k_j \in \mathbb{R}^b$, and value vectors $v_j \in \mathbb{R}^b$, where $\mathbb{R}$ denotes the set of real numbers and $b$ is the feature dimension; these are used for graph structure updating and attention aggregation, as shown in the following formulas:

$$q_i = W_1\,{}^{(l)}\!f_i^{Q} + b_1,\qquad k_j = W_2\,{}^{(l)}\!f_j^{S} + b_2,\qquad v_j = W_3\,{}^{(l)}\!f_j^{S} + b_3$$

where $\{Q,S\} \in \{X,Y\}^2$; $Q = S$ gives self-attention and $Q \neq S$ gives cross-attention; $W$ and $b$ are learnable projection parameters; ${}^{(l)}\!f_i^{Q}$ denotes the feature of a key point of point cloud $Q$ at layer $l$ of the dynamic graph attention module and acts as the query node, while all nodes of point cloud $S$ act as source nodes.

By computing the similarity (correlation) between $q_i$ and each $k_j$, the weight coefficient $\alpha_{ij}$ of each $v_j$ is obtained; a weighted sum of the $v_j$ then gives the final attention value, the message being computed as a weighted average:

$$m_i = \sum_{j:(i,j)\in\varepsilon} \alpha_{ij}\, v_j$$

where the weight coefficient $\alpha_{ij} = \operatorname{softmax}_j\!\big(q_i^{\top} k_j / \sqrt{b}\big)$ expresses how strongly feature ${}^{(l)}\!f_i^{Q}$ attends to feature ${}^{(l)}\!f_j^{S}$, and $\varepsilon \in \{\varepsilon_{\mathrm{self}}, \varepsilon_{\mathrm{cross}}\}$.

Once all layers have been aggregated, the final feature of a node can be expressed as:

$$f_i = W f_i + b$$
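Assuming the usual scaled dot-product form for $\alpha_{ij}$, one attention step can be sketched in NumPy. The parameter names `W1`/`b1` mirror the query projection above; `W2`, `W3`, `b2`, `b3` are hypothetical names extrapolated from that pattern, and the mask ties the step back to the top-k dynamic graph:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(f_q, f_s, params, mask=None):
    """One self-/cross-attention step. f_q: query-node features of cloud Q,
    f_s: source-node features of cloud S (pass f_q twice for self-attention).
    params holds the learnable projections W1..W3, b1..b3 and the final W, b."""
    b_dim = f_q.shape[1]
    q = f_q @ params["W1"].T + params["b1"]   # q_i = W1 f_i^Q + b1
    k = f_s @ params["W2"].T + params["b2"]
    v = f_s @ params["W3"].T + params["b3"]
    logits = q @ k.T / np.sqrt(b_dim)         # scaled dot products
    if mask is not None:                      # keep only the top-k graph edges
        logits = np.where(mask, logits, -np.inf)
    alpha = softmax(logits, axis=1)           # a_ij, each row sums to 1
    msg = alpha @ v                           # weighted average of value vectors
    return msg @ params["W"].T + params["b"]  # final linear map f_i = W f_i + b
```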

The U-Net network also comprises an encoder module for extracting features and a decoder module for fusing them. The encoder module obtains features at multiple scales through repeated downsampling and converts all fully connected layers of the multilayer perceptron into a series of fully convolutional layers with kernel size 1×1×1 and channel counts (64, 128, 256, 512). At each upsampling step, the decoder module fuses features of the same scale and channel count from the feature extraction part through skip connections, recovering the information lost by downsampling.

Step 3: input the multi-resolution 3D point clouds of the Terracotta Warriors into the point cloud registration network and train it under the joint supervision of the circle loss and the overlap loss.

Circle loss is a variant of the triplet loss commonly used in point cloud registration; the loss function is:

$$\mathcal{L}_c^{X} = \frac{1}{n_x}\sum_{i=1}^{n_x} \log\!\Big[\, 1 + \sum_{j\in\varepsilon_x(x_i)} e^{\beta_p^{j}\,(d_i^{j}-\Delta_x)} \cdot \sum_{k\in\varepsilon_n(x_i)} e^{\beta_n^{k}\,(\Delta_n-d_i^{k})} \Big]$$

where the overlapping point cloud pair $X$, $Y$ has been aligned; $n_x$ is the number of points randomly sampled from point cloud $X$; $\varepsilon_x(x_i)$ denotes the set of points of point cloud $Y$ lying within radius $r_x$ of point $x_i$, and $\varepsilon_n(x_i)$ the set of points of $Y$ lying outside radius $r_x$ of $x_i$; $d_i^{j}$ denotes the distance between two features in feature space; $\Delta_x$ and $\Delta_n$ denote the positive and negative sample margins, respectively. The weight $\beta_p^{j}$ is determined by the hyperparameter $\gamma$, the feature distance $d_i^{j}$, and the positive margin $\Delta_x$; likewise, the weight $\beta_n^{k}$ is determined by $\gamma$, the feature distance $d_i^{k}$, and the negative margin $\Delta_n$. $\mathcal{L}_c^{Y}$ is computed in the same way, so the final circle loss is $\mathcal{L}_c = \big(\mathcal{L}_c^{X} + \mathcal{L}_c^{Y}\big)/2$.
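A minimal NumPy sketch of the per-cloud term $\mathcal{L}_c^{X}$, assuming Euclidean feature distances and the customary non-negative clipping of the self-paced weights (the clipping and the default margin values are assumptions, not specifics of the text); `pos_sets`/`neg_sets` stand for the precomputed index sets $\varepsilon_x(x_i)$ and $\varepsilon_n(x_i)$:

```python
import numpy as np

def circle_loss_half(feat_x, feat_y, pos_sets, neg_sets,
                     delta_x=0.1, delta_n=1.4, gamma=10.0):
    """One half (L_c^X) of the circle loss. pos_sets[i] / neg_sets[i] hold the
    indices of the points of Y inside / outside radius r_x around x_i,
    precomputed on the aligned clouds."""
    losses = []
    for i, (pos, neg) in enumerate(zip(pos_sets, neg_sets)):
        # Euclidean distances between the anchor feature and all features of Y
        d = np.linalg.norm(feat_x[i] - feat_y, axis=1)
        # self-paced weights beta, clipped at zero (assumed)
        beta_p = gamma * np.maximum(d[pos] - delta_x, 0.0)
        beta_n = gamma * np.maximum(delta_n - d[neg], 0.0)
        lp = np.sum(np.exp(beta_p * (d[pos] - delta_x)))
        ln = np.sum(np.exp(beta_n * (delta_n - d[neg])))
        losses.append(np.log1p(lp * ln))
    return float(np.mean(losses))
```

Averaging this with the analogous `circle_loss_half` over $Y$ gives the final $\mathcal{L}_c$. The loss grows rapidly when a positive pair is far apart or a negative pair is close in feature space, which is exactly the behaviour the supervision needs.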

The estimation of the overlap region is cast as a binary classification problem and supervised with the overlap loss:

$$\mathcal{L}_o^{X} = -\frac{1}{|X|}\sum_{i=1}^{|X|} \Big[\, \bar{o}_{x_i}\log o_{x_i} + (1-\bar{o}_{x_i})\log(1-o_{x_i}) \Big]$$

where the ground-truth label $\bar{o}_{x_i}$ indicates whether point $x_i$ lies in the overlap region, and $o_{x_i}$ denotes the label predicted by the network; $\mathcal{L}_o^{Y}$ is computed in the same way.
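Reading the overlap supervision as plain binary cross-entropy, the per-cloud term can be sketched as follows (an illustrative sketch; any per-point weighting used in actual training is not specified in the text):

```python
import numpy as np

def overlap_loss_half(pred, gt, eps=1e-7):
    """L_o^X: binary cross-entropy between predicted overlap probabilities
    o_x in (0, 1) and ground-truth overlap labels in {0, 1}; L_o^Y is
    computed analogously."""
    pred = np.clip(pred, eps, 1.0 - eps)   # guard the logarithms
    return float(-np.mean(gt * np.log(pred) + (1.0 - gt) * np.log(1.0 - pred)))
```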

Step 4: input the source and target point clouds into the trained point cloud registration network, passing them through the encoder module and the dynamic graph attention module in turn; the decoder module then outputs the extracted features, and the transformation matrix is computed with the RANSAC algorithm to complete the registration.
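This inference stage can be sketched as nearest-neighbour matching in feature space followed by a simple 3-point RANSAC with a Kabsch/SVD solver. This is a hedged illustration; the invention's actual features and RANSAC variant may differ:

```python
import numpy as np

def kabsch(a, b):
    """Exact least-squares rigid fit (R, t) of a onto b."""
    ca, cb = a.mean(axis=0), b.mean(axis=0)
    H = (a - ca).T @ (b - cb)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cb - R @ ca

def ransac_registration(src, dst, feat_src, feat_dst,
                        iters=200, thresh=0.05, seed=0):
    """Match points by nearest neighbour in feature space, then estimate the
    rigid transform from random 3-point samples, keeping the hypothesis
    with the most inliers."""
    d2 = ((feat_src[:, None] - feat_dst[None]) ** 2).sum(axis=-1)
    corr = d2.argmin(axis=1)          # putative correspondences
    rng = np.random.default_rng(seed)
    best, best_in = (np.eye(3), np.zeros(3)), -1
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = kabsch(src[idx], dst[corr[idx]])
        resid = np.linalg.norm(src @ R.T + t - dst[corr], axis=1)
        inliers = int((resid < thresh).sum())
        if inliers > best_in:
            best_in, best = inliers, (R, t)
    return best
```

In practice the winning hypothesis is usually refined by re-running the Kabsch fit on all of its inliers.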

Figs. 5 and 7 show the initial poses, acquired with a 3D scanner, of the point clouds of two Terracotta Warrior heads and two Terracotta Warrior feet, respectively; Figs. 6 and 8 show the corresponding registration results obtained with the method of the present invention. The figures show that the method remains robust under partial overlap and density variation of the point clouds.

Claims (4)

1. A robust registration method for three-dimensional point clouds of Qin Terracotta Warriors based on a dynamic graph attention mechanism, characterized by comprising the following steps:
step 1, acquiring three-dimensional point clouds of the Qin Terracotta Warriors at different resolutions through a three-dimensional scanner;
step 2, replacing the convolution layer with B-NHN-Conv, replacing the deconvolution layer with B-NHN-ConvTr, and embedding a residual module and a dynamic graph attention mechanism into the U-Net network to obtain a point cloud registration network;
step 3, inputting three-dimensional point clouds of Qin warriors with different resolutions into a point cloud registration network, and training the constructed point cloud registration network under the joint supervision of a circle loss function and an overlap loss function;
and 4, extracting three-dimensional point cloud characteristics of the Qin warriors by using the trained point cloud registration network, and estimating a change matrix between the source point cloud and the target point cloud by combining a RANSAC algorithm to complete registration of the three-dimensional point cloud of the Qin warriors.
2. The robust registration method for three-dimensional point clouds of Qin Terracotta Warriors based on the dynamic graph attention mechanism as claimed in claim 1, wherein the U-Net network in step 2 further comprises an encoder module and a decoder module; the encoder module obtains features at a plurality of scales through repeated downsampling and converts all fully connected layers in the multilayer perceptron into a series of fully convolutional layers with a kernel size of 1×1×1 and channel numbers (64, 128, 256, 512); and each time the decoder module performs upsampling, it fuses features of the same scale and channel number from the feature extraction part through the skip connection structure, recovering the information loss caused by downsampling.
3. The robust registration method for three-dimensional point clouds of Qin Terracotta Warriors based on the dynamic graph attention mechanism as claimed in claim 1, wherein the residual module in step 2 is formed by connecting a plurality of residual blocks in series and is used to skip-connect the input information to the output information.
4. The robust registration method for three-dimensional point clouds of Qin Terracotta Warriors based on the dynamic graph attention mechanism as claimed in claim 1, wherein the dynamic graph attention mechanism in step 2 comprises a combination of multilayer self-attention modules and cross-attention modules; based on the attention weights a_ij, the mechanism finds the top k edges for each query node and constructs the graph using only those k edges and the corresponding nodes, and at each layer of the dynamic graph attention mechanism a new graph is constructed from the updated attention weights a_ij.
CN202111245398.1A 2021-10-26 2021-10-26 A robust registration method for 3D point cloud of Qin Terracotta Warriors based on dynamic graph attention mechanism Active CN114037743B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111245398.1A CN114037743B (en) 2021-10-26 2021-10-26 A robust registration method for 3D point cloud of Qin Terracotta Warriors based on dynamic graph attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111245398.1A CN114037743B (en) 2021-10-26 2021-10-26 A robust registration method for 3D point cloud of Qin Terracotta Warriors based on dynamic graph attention mechanism

Publications (2)

Publication Number Publication Date
CN114037743A true CN114037743A (en) 2022-02-11
CN114037743B CN114037743B (en) 2024-01-26

Family

ID=80135357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111245398.1A Active CN114037743B (en) 2021-10-26 2021-10-26 A robust registration method for 3D point cloud of Qin Terracotta Warriors based on dynamic graph attention mechanism

Country Status (1)

Country Link
CN (1) CN114037743B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546482A (en) * 2022-09-26 2022-12-30 浙江省测绘科学技术研究院 Outdoor point cloud semantic segmentation method based on statistical projection
CN115631341A (en) * 2022-12-21 2023-01-20 北京航空航天大学 A point cloud registration method and system based on multi-scale feature voting
CN115631221A (en) * 2022-11-30 2023-01-20 北京航空航天大学 Low-overlapping-degree point cloud registration method based on consistency sampling
CN119180828A (en) * 2024-11-22 2024-12-24 温州大学 Part surface point cloud segmentation method based on multi-scale network of dynamic graph attention

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596961A (en) * 2018-04-17 2018-09-28 浙江工业大学 Point cloud registration method based on Three dimensional convolution neural network
US20190147245A1 (en) * 2017-11-14 2019-05-16 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN111046781A (en) * 2019-12-09 2020-04-21 华中科技大学 Robust three-dimensional target detection method based on ternary attention mechanism
CN111882593A (en) * 2020-07-23 2020-11-03 首都师范大学 A point cloud registration model and method combining attention mechanism and 3D graph convolutional network
CN112837356A (en) * 2021-02-06 2021-05-25 湖南大学 An unsupervised multi-view 3D point cloud co-registration method based on WGAN
WO2021104056A1 (en) * 2019-11-27 2021-06-03 中国科学院深圳先进技术研究院 Automatic tumor segmentation system and method, and electronic device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147245A1 (en) * 2017-11-14 2019-05-16 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN108596961A (en) * 2018-04-17 2018-09-28 浙江工业大学 Point cloud registration method based on Three dimensional convolution neural network
WO2021104056A1 (en) * 2019-11-27 2021-06-03 中国科学院深圳先进技术研究院 Automatic tumor segmentation system and method, and electronic device
CN111046781A (en) * 2019-12-09 2020-04-21 华中科技大学 Robust three-dimensional target detection method based on ternary attention mechanism
CN111882593A (en) * 2020-07-23 2020-11-03 首都师范大学 A point cloud registration model and method combining attention mechanism and 3D graph convolutional network
CN112837356A (en) * 2021-02-06 2021-05-25 湖南大学 An unsupervised multi-view 3D point cloud co-registration method based on WGAN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TANG Can; TANG Liang-gui; LIU Bo: "A Survey of Image Feature Detection and Matching Methods", Journal of Nanjing University of Information Science & Technology (Natural Science Edition), no. 03 *
MA Fu-feng; GENG Nan; ZHANG Zhi-yi: "Registration of 3D Plant Morphology Based on Neighborhood Geometric Feature Constraints", Computer Applications and Software, no. 09 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546482A (en) * 2022-09-26 2022-12-30 浙江省测绘科学技术研究院 Outdoor point cloud semantic segmentation method based on statistical projection
CN115631221A (en) * 2022-11-30 2023-01-20 北京航空航天大学 Low-overlapping-degree point cloud registration method based on consistency sampling
CN115631341A (en) * 2022-12-21 2023-01-20 北京航空航天大学 A point cloud registration method and system based on multi-scale feature voting
CN119180828A (en) * 2024-11-22 2024-12-24 温州大学 Part surface point cloud segmentation method based on multi-scale network of dynamic graph attention
CN119180828B (en) * 2024-11-22 2025-03-25 温州大学 Part surface point cloud segmentation method based on multi-scale network with dynamic graph attention

Also Published As

Publication number Publication date
CN114037743B (en) 2024-01-26

Similar Documents

Publication Publication Date Title
CN114037743B (en) A robust registration method for 3D point cloud of Qin Terracotta Warriors based on dynamic graph attention mechanism
CN109949368B (en) A 3D Pose Estimation Method of Human Body Based on Image Retrieval
CN109948425B (en) A pedestrian search method and device based on structure-aware self-attention and online instance aggregation and matching
CN109840556B (en) Image classification and identification method based on twin network
CN107229757B (en) Video retrieval method based on deep learning and hash coding
CN106055576B (en) A kind of fast and effectively image search method under large-scale data background
CN110659565B (en) 3D multi-person human body posture estimation method based on porous convolution
CN113988147B (en) Multi-label classification method and device for remote sensing image scene based on graph network, and multi-label retrieval method and device
Zhang et al. DDRNet: Fast point cloud registration network for large-scale scenes
CN112396167B (en) Loop detection method for fusing appearance similarity and spatial position information
CN114612660A (en) Three-dimensional modeling method based on multi-feature fusion point cloud segmentation
CN112818920B (en) A combined spatial and spectral change detection method for dual-phase hyperspectral images
CN116228825B (en) Point cloud registration method based on significant anchor point geometric embedding
CN116128944A (en) Three-dimensional point cloud registration method based on feature interaction and reliable corresponding relation estimation
CN111401303A (en) Cross-visual angle gait recognition method with separated identity and visual angle characteristics
CN114663880B (en) Three-dimensional object detection method based on multi-level cross-modal self-attention mechanism
CN113989283B (en) 3D human body posture estimation method and device, electronic equipment and storage medium
CN112036511B (en) Image retrieval method based on attention mechanism graph convolutional neural network
CN115423847A (en) Twin Multimodal Target Tracking Method Based on Transformer
CN113793261A (en) Spectrum reconstruction method based on 3D attention mechanism full-channel fusion network
CN116665114A (en) Multi-mode-based remote sensing scene identification method, system and medium
CN117788810A (en) A learning system for unsupervised semantic segmentation
CN119360383A (en) Point cloud instance segmentation method based on weakly supervised dynamic convolution
CN115861664A (en) Feature matching method and system based on local feature fusion and self-attention mechanism
CN118887264A (en) Point cloud registration method and system based on embedding local features into global features, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant