CN112488117B - Point cloud analysis method based on direction-induced convolution
- Publication number: CN112488117B
- Application number: CN202011436923.3A
- Authority: CN (China)
- Prior art keywords: induced, convolution, point cloud, point, features
- Prior art date: 2020-12-11
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V10/267 - Image or video recognition or understanding; image preprocessing; segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06F18/24 - Electric digital data processing; pattern recognition; classification techniques
- G06N3/045 - Computing arrangements based on biological models; neural networks; architectures; combinations of networks
- G06V10/40 - Image or video recognition or understanding; extraction of image or video features
Abstract
The invention discloses a point cloud analysis method based on direction-induced convolution, comprising: constructing a direction-induced convolution module for extracting features of the unordered neighborhood of a point cloud; constructing, on the basis of the direction-induced convolution module, a residual direction-induced convolution module and a farthest-point-sampling residual direction-induced convolution module; constructing a direction-induced convolutional network from the residual direction-induced convolution module and the farthest-point-sampling residual direction-induced convolution module; and feeding point cloud data into the direction-induced convolutional network to obtain point cloud segmentation and classification results. The invention better captures the local spatial structure of the point cloud in an end-to-end manner and improves the accuracy of both point cloud classification and point cloud segmentation.
Description
Technical Field
The invention belongs to the field of 3D point cloud analysis, and in particular relates to a point cloud analysis method based on direction-induced convolution.
Background
As a simple and direct representation of 3D data, point clouds can describe the geometric and topological information of objects in 3D space well and are widely used for understanding the surrounding environment; in recent years, point cloud analysis has attracted increasing attention. Unlike regular images, point clouds in 3D space have an irregular, unordered topology, which makes it difficult to apply standard convolution operations. To remedy this at the level of data representation, some works convert the raw point cloud into a set of 2D images or a 3D voxel representation so that convolution can be performed directly on a regular grid. However, these conversions usually lose a large amount of intrinsic geometric information while incurring high computational complexity.
Summary of the Invention
The purpose of the present invention is to provide a point cloud analysis method based on direction-induced convolution.
The technical solution realizing the object of the present invention is a point cloud analysis method based on direction-induced convolution, comprising the following steps:
Step 1: construct a direction-induced convolution module, which is used to extract features of the unordered neighborhood of the point cloud.
Step 2: based on the direction-induced convolution module, construct a residual direction-induced convolution module and a farthest-point-sampling residual direction-induced convolution module.
Step 3: construct a direction-induced convolutional network from the residual direction-induced convolution module and the farthest-point-sampling residual direction-induced convolution module.
Step 4: feed the point cloud data into the direction-induced convolutional network to obtain point cloud segmentation and classification results.
Preferably, the specific steps by which the direction-induced convolution module extracts features of the unordered neighborhood of the point cloud are as follows:
Compute the spatial direction features of the local neighborhood points of the point cloud.
Compute and normalize the spatial direction similarity weights between the neighborhood points and the items of the spatial direction set.
Transform the features of the unordered neighborhood points of the point cloud into the direction set space.
In the direction set space, encode the spatially transformed features projected onto the items of the spatial direction set with a standard convolution, completing feature extraction for the unordered neighborhood of the point cloud.
Preferably, the spatial direction similarity weight is computed as

$$a_{k,m} = \langle d_i^k,\, x_m \rangle,$$

where $d_i^k$ is the spatial direction feature of the neighborhood point of each point, and each item $x_m \in D$ is a direction in 3D space.
Preferably, the spatial direction feature of each neighborhood point $p_i^k$ of a point $p_i$ is encoded as

$$d_i^k = \frac{p_i^k - p_i}{\lVert p_i^k - p_i \rVert_2}.$$
Preferably, the feature transformation is expressed as

$$\tilde f_{i,m} = \sum_{p_i^k \in N(p_i)} \hat a_{k,m} \big( f_i^k \oplus e_i^k \big),$$

where $f_i^k$ is the feature of neighborhood point $p_i^k$; $\oplus$ denotes concatenation; $\hat a_{k,m}$ is the normalized spatial direction similarity; $e_i^k$ is the relative position encoding of the neighborhood point; and $N(p_i)$ is the local point cloud region centered on point $p_i$.
Preferably, the specific process of encoding, with a standard convolution, the spatially transformed features projected onto the items of the spatial direction set is

$$f_i^{\,out} = \psi_{conv}\big(\tilde f_{i,1}, \dots, \tilde f_{i,M}\big) = \sum_{m=1}^{M} w_m \tilde f_{i,m},$$

where $\psi_{conv}$ denotes the convolution over the transformed features, $w_1, \dots, w_M$ are distinct learnable parameters, and $\tilde f_{i,m}$ is the weighted sum of the projections of all neighborhood point features onto item $x_m$ of the direction set $D$.
Preferably, the direction set space is generated as follows:
Sample S groups of point clouds from the point cloud data.
For each sampled point cloud $P_s = \{p_{s,i} \in \mathbb{R}^3 \mid i = 1, \dots, N\}$, simulate the farthest point sampling and neighborhood construction processes of each residual direction-induced convolution module and each farthest-point-sampling residual direction-induced convolution module in the encoding stage of the direction-induced convolutional network, and compute the spatial direction features of all sampled points at each layer.
Merge all spatial direction features of the corresponding layers across the S groups of point clouds, and cluster the spatial direction features of each layer with the K-means algorithm to generate the spatial direction set $D_l$ for each layer of the direction-induced convolutional network.
Preferably, the residual direction-induced convolution module comprises a first multilayer perceptron, a direction-induced convolution branch, a second multilayer perceptron, and a third multilayer perceptron. The first multilayer perceptron extracts point-wise features of the input point cloud, which are fed separately into the direction-induced convolution branch and the second multilayer perceptron.
The direction-induced convolution branch comprises a first direction-induced convolution module and a fourth multilayer perceptron; the first direction-induced convolution extracts features of the unordered neighborhood of the point cloud, and the fourth multilayer perceptron raises the feature dimension. The second multilayer perceptron raises the feature dimension so that it matches the feature dimension output by the direction-induced convolution branch. The outputs of the direction-induced convolution branch and the second multilayer perceptron are concatenated and then fused by the third multilayer perceptron to obtain the output of the residual direction-induced convolution module.
Preferably, the farthest-point-sampling residual direction-induced convolution module comprises a farthest point sampling module, a fifth multilayer perceptron, a direction-induced convolution branch, a max pooling branch, and an eighth multilayer perceptron. The farthest point sampling module downsamples the input point cloud; the fifth multilayer perceptron extracts point-wise features, which are fed separately into the direction-induced convolution branch and the max pooling branch. The direction-induced convolution branch comprises a second direction-induced convolution and a sixth multilayer perceptron; the max pooling branch comprises a max pooling layer and a seventh multilayer perceptron. The second direction-induced convolution extracts features of the unordered neighborhood of the point cloud, and the max pooling branch aggregates local features. The sixth and seventh multilayer perceptrons both raise the feature dimension and make the output dimensions of the two branches equal. The features output by the two branches are concatenated and then fused by the eighth multilayer perceptron to obtain the output of the farthest-point-sampling residual direction-induced convolution module.
Compared with the prior art, the present invention has the following significant advantages: 1) it better captures the local spatial structure of the point cloud, with the neighborhood information of every point projected into a canonical, ordered direction set space; 2) by alternately stacking residual direction-induced convolution modules and farthest-point-sampling residual direction-induced convolution modules, it builds a direction-induced convolutional network that performs point cloud analysis in an end-to-end manner, improving the accuracy of point cloud classification and point cloud segmentation; 3) it achieves superior performance on both point cloud classification and semantic segmentation tasks.
Description of the Drawings
Figure 1 is a schematic diagram of the structure of the direction-induced convolution of the present invention.
Figure 2 is a schematic diagram of the structure of the residual direction-induced convolution module of the present invention.
Figure 3 is a schematic diagram of the structure of the farthest-point-sampling residual direction-induced convolution module of the present invention.
Figure 4 is a schematic diagram of the structure of the direction-induced convolutional network of the present invention.
Detailed Description
The solution of the present invention is further described below with reference to the accompanying drawings and specific embodiments.
As shown in Figures 1 to 4, a point cloud analysis method based on direction-induced convolution comprises the following steps:
Step 1: construct a direction-induced convolution module, which is used to extract features of the unordered neighborhood of the point cloud.
In a further embodiment, the specific process of extracting features of the unordered neighborhood of the point cloud is as follows:
Compute the spatial direction features of the local neighborhood points of the point cloud.
Let $P = \{p_i \in \mathbb{R}^3 \mid i = 1, \dots, N\}$ denote a set of input points and $F = \{f_i \in \mathbb{R}^d \mid i = 1, \dots, N\}$ its corresponding feature set, where $N$ is the number of points, $p_i$ holds the three-dimensional coordinates (x, y, z), and $f_i$ holds additional features (such as RGB color, surface normal, or laser reflection intensity). For each point $p_i \in P$, let $N(p_i)$ denote a local point cloud region centered on $p_i$, where $r \in \mathbb{R}$ is a selected radius. The spatial direction of each neighborhood point $p_i^k \in N(p_i)$ is encoded as

$$d_i^k = \frac{p_i^k - p_i}{\lVert p_i^k - p_i \rVert_2}.$$
Compute and normalize the spatial direction similarity weights between the neighborhood points and the items of the spatial direction set.
Let $D = \{x_m \in \mathbb{R}^3 \mid m = 1, \dots, M\}$ denote a spatial direction set with $M$ items, where each item $x_m \in D$ represents a direction in 3D space. For each neighborhood point $p_i^k$, the similarity between its spatial direction feature $d_i^k$ and each item $x_m$ of the direction set is computed to obtain the similarity weight of that neighborhood point in direction $x_m$:

$$a_{k,m} = \langle d_i^k,\, x_m \rangle.$$
The similarity weights computed between each neighborhood point $p_i^k$ and the items of the direction set $D$ are then normalized; softmax is used here so that the similarity weights become sparser:

$$\hat a_{k,m} = \frac{\exp(a_{k,m})}{\sum_{m'=1}^{M} \exp(a_{k,m'})},$$

where $\hat a_{k,m}$ denotes the normalized similarity weight of neighborhood point $p_i^k$ on item $x_m$.
Perform the feature space transformation, converting the features of the unordered neighborhood points of the point cloud into the canonical, ordered direction set space, specifically:
Based on the normalized similarity weights, the features of all neighborhood points $p_i^k$ of point $p_i$ are projected into the canonical, ordered direction set space $D$. This process, called the feature space transformation, is expressed as

$$\tilde f_{i,m} = \sum_{p_i^k \in N(p_i)} \hat a_{k,m} \big( f_i^k \oplus e_i^k \big),$$

where $f_i^k$ is the feature of neighborhood point $p_i^k$, $\oplus$ denotes concatenation, and $e_i^k$ is the relative position encoding of that point, computed from the coordinates of $p_i^k$ and $p_i$. The transformed feature $\tilde f_{i,m}$ is the weighted sum of the projections of all neighborhood point features onto item $x_m$ of the direction set $D$. At this point, the features of all unordered neighborhood points of $p_i$ have been transformed into the regular, ordered direction set space.
In the direction set space, a standard convolution is used to encode the spatially transformed features projected onto the items of the spatial direction set, completing the feature extraction of the unordered neighborhood of the point cloud and giving a more robust representation of the unordered local region:

$$f_i^{\,out} = \psi_{conv}\big(\tilde f_{i,1}, \dots, \tilde f_{i,M}\big) = \sum_{m=1}^{M} w_m \tilde f_{i,m},$$

where $\psi_{conv}$ denotes the convolution over the transformed features and $w_1, \dots, w_M$ are distinct learnable parameters.
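The following Python sketch, offered for illustration only, walks through one direction-induced convolution step for a single center point. It assumes dot-product (cosine) similarity between the unit direction features and the direction set, uses the raw offset vector as a stand-in for the relative position encoding $e_i^k$, and realizes $\psi_{conv}$ as a learned weighted sum over the $M$ direction slots; the names, shapes, and these specific choices are assumptions, not the patent's exact formulation.

```python
# A minimal sketch of one direction-induced convolution step for one point p_i.
import torch
import torch.nn.functional as F


def direction_induced_conv(center, neighbors, neighbor_feats, direction_set, weights):
    """center: (3,)           coordinates of p_i
    neighbors: (K, 3)         coordinates of the K neighborhood points
    neighbor_feats: (K, C)    features f_i^k of the neighborhood points
    direction_set: (M, 3)     clustered direction set D (unit vectors)
    weights: (M,)             learnable parameters w_1..w_M of psi_conv
    returns: (C + 3,)         encoded feature of the local region
    """
    # Spatial direction feature: unit vector from the center to each neighbor.
    offsets = neighbors - center                                  # (K, 3)
    d = offsets / (offsets.norm(dim=1, keepdim=True) + 1e-8)      # (K, 3)

    # Similarity of each neighbor direction to each item of D, then softmax
    # over the M directions so the weights become sparse and sum to 1.
    sim = d @ direction_set.t()                                   # (K, M)
    a_hat = F.softmax(sim, dim=1)                                 # (K, M)

    # Relative position encoding (here simply the offset vector, an assumption)
    # concatenated with the neighbor features.
    h = torch.cat([neighbor_feats, offsets], dim=1)               # (K, C+3)

    # Feature space transformation: project every neighbor feature onto each
    # direction item and sum over the (unordered) neighbors.
    f_tilde = a_hat.t() @ h                                       # (M, C+3)

    # psi_conv: weighted combination over the now-ordered direction axis.
    return (weights.unsqueeze(1) * f_tilde).sum(dim=0)            # (C+3,)


# Toy usage: 16 neighbors, 8-dim features, a direction set with M = 15 items.
K, C, M = 16, 8, 15
out = direction_induced_conv(
    torch.zeros(3), torch.randn(K, 3), torch.randn(K, C),
    F.normalize(torch.randn(M, 3), dim=1), torch.randn(M))
print(out.shape)  # torch.Size([11])
```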
Step 2: based on the direction-induced convolution module, construct a residual direction-induced convolution module and a farthest-point-sampling residual direction-induced convolution module, which retain richer point cloud feature information.
As shown in Figure 2, in a further embodiment, the residual direction-induced convolution module comprises a first multilayer perceptron, a direction-induced convolution branch, a second multilayer perceptron, and a third multilayer perceptron. The first multilayer perceptron extracts point-wise features of the input point cloud, which are then fed separately into the direction-induced convolution branch and the second multilayer perceptron.
The direction-induced convolution branch comprises a first direction-induced convolution module and a fourth multilayer perceptron; the first direction-induced convolution extracts features of the unordered neighborhood of the point cloud, and the fourth multilayer perceptron raises the feature dimension to retain more feature information. The second multilayer perceptron raises the feature dimension so that it matches the feature dimension output by the direction-induced convolution branch. Finally, the output features of the two branches are concatenated and fused by the third multilayer perceptron to obtain the output of the residual direction-induced convolution module.
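As a rough illustration, the module's two-branch layout might be organized as in the sketch below. The channel sizes, the identity stand-in for the direction-induced convolution, and the realization of the "multilayer perceptrons" as 1x1 convolutions with BN and ReLU (stated later in the embodiment) are assumptions of this sketch.

```python
# A minimal sketch of the residual direction-induced convolution module.
import torch
import torch.nn as nn


def mlp(c_in, c_out):
    # "Multilayer perceptron" in the patent's sense: 1x1 conv + BN + ReLU.
    return nn.Sequential(nn.Conv1d(c_in, c_out, 1),
                         nn.BatchNorm1d(c_out), nn.ReLU())


class ResidualDIConv(nn.Module):
    def __init__(self, c_in, c_mid, c_out, di_conv):
        super().__init__()
        self.mlp1 = mlp(c_in, c_mid)       # first MLP: point-wise features
        self.di_conv = di_conv             # direction-induced conv (stand-in)
        self.mlp4 = mlp(c_mid, c_out)      # fourth MLP: raise dimension
        self.mlp2 = mlp(c_mid, c_out)      # second MLP: match branch dimension
        self.mlp3 = mlp(2 * c_out, c_out)  # third MLP: fuse concatenated branches

    def forward(self, xyz, feats):
        """xyz: (B, 3, N) coordinates; feats: (B, C_in, N) point features."""
        f = self.mlp1(feats)
        branch_a = self.mlp4(self.di_conv(xyz, f))   # DI-conv branch
        branch_b = self.mlp2(f)                      # residual branch
        return self.mlp3(torch.cat([branch_a, branch_b], dim=1))


# Toy usage with an identity stand-in for the direction-induced convolution.
block = ResidualDIConv(8, 32, 64, di_conv=lambda xyz, f: f)
out = block(torch.randn(2, 3, 1024), torch.randn(2, 8, 1024))
print(out.shape)  # torch.Size([2, 64, 1024])
```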
As shown in Figure 3, in a further embodiment, the farthest-point-sampling residual direction-induced convolution module comprises a farthest point sampling module, a fifth multilayer perceptron, a direction-induced convolution branch, a max pooling branch, and an eighth multilayer perceptron. First, the farthest point sampling module downsamples the input point cloud, reducing the number of points for multi-scale analysis; the fifth multilayer perceptron then extracts point-wise features, which are fed separately into the direction-induced convolution branch and the max pooling branch. The direction-induced convolution branch comprises a second direction-induced convolution and a sixth multilayer perceptron; the max pooling branch comprises a max pooling layer and a seventh multilayer perceptron. The second direction-induced convolution extracts features of the unordered neighborhood of the point cloud, while the max pooling branch aggregates local features with a max pooling operation. Both branches then use the sixth and seventh multilayer perceptrons to raise the feature dimension, retaining more feature information and making the output dimensions of the two branches equal. Finally, the features output by the two branches are concatenated and fused by the eighth multilayer perceptron to obtain the output of the farthest-point-sampling residual direction-induced convolution module.
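A minimal sketch of the farthest point sampling step used by this module is given below; the greedy max-min selection is the standard FPS formulation, and the array shapes are illustrative.

```python
# A minimal NumPy sketch of farthest point sampling (FPS).
import numpy as np


def farthest_point_sample(points: np.ndarray, n_samples: int) -> np.ndarray:
    """points: (N, 3) array; returns the indices of n_samples points."""
    n = points.shape[0]
    chosen = np.zeros(n_samples, dtype=np.int64)
    dist = np.full(n, np.inf)
    chosen[0] = 0  # start from an arbitrary point
    for i in range(1, n_samples):
        # Distance of every point to the most recently chosen sample.
        d = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        dist = np.minimum(dist, d)        # distance to the nearest chosen sample
        chosen[i] = int(np.argmax(dist))  # pick the farthest remaining point
    return chosen


pts = np.random.rand(4096, 3)
idx = farthest_point_sample(pts, 1024)    # downsample 4096 -> 1024 points
print(idx.shape)  # (1024,)
```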
Step 3: construct a direction-induced convolutional network from the residual direction-induced convolution module and the farthest-point-sampling residual direction-induced convolution module, to capture hierarchical features of the input point cloud and perform point cloud analysis in an end-to-end manner.
As shown in Figure 4, the direction-induced convolutional network comprises a point cloud segmentation sub-network and a classification sub-network.
The point cloud segmentation sub-network and the classification sub-network use the same encoder to encode features of the input point cloud. The encoder is formed by alternately connecting residual direction-induced convolution modules and farthest-point-sampling residual direction-induced convolution modules: a first residual direction-induced convolution module, a first farthest-point-sampling residual direction-induced convolution module, a second residual direction-induced convolution module, a second farthest-point-sampling residual direction-induced convolution module, and a third residual direction-induced convolution module. After features are extracted from the input point cloud P by the first residual direction-induced convolution module, the result is fed into the first farthest-point-sampling residual direction-induced convolution module to obtain the downsampled point cloud P1 and its features. Likewise, after the output P1 of the first farthest-point-sampling residual direction-induced convolution module passes through the second residual direction-induced convolution module, it is fed into the second farthest-point-sampling residual direction-induced convolution module to obtain the downsampled point cloud P2 and its features. Connected alternately in this way, the five modules extract local neighborhood features of the point cloud hierarchically for multi-scale analysis.
The decoder of the point cloud segmentation sub-network comprises a first upsampling, a second upsampling, a ninth multilayer perceptron, a tenth multilayer perceptron, and a first fully connected layer. The decoder uses K-nearest-neighbor interpolation to upsample the downsampled point clouds produced by the encoder, restoring the number of points layer by layer and propagating the point features. At the same time, skip connections and multilayer perceptrons combine the features of corresponding layers in the encoder and decoder. Specifically, the output P2 of the third residual direction-induced convolution module in the encoder passes through the first upsampling to obtain the upsampled point cloud P1' and its features, where P1' has the same number of points as the output point cloud P1 of the second residual direction-induced convolution module; the features of the P1 and P1' point clouds are concatenated and fused by the ninth multilayer perceptron to obtain a new P1 point cloud and its features. Likewise, the output point cloud P1 of the ninth multilayer perceptron is upsampled to obtain the point cloud P' and its features, and the tenth multilayer perceptron fuses the features of P' with those of the output point cloud P of the first residual direction-induced convolution module, yielding a new P point cloud and its features. Finally, the first fully connected layer predicts the label of each point to obtain the segmentation result.
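The K-nearest-neighbor interpolation used for decoder upsampling might look like the following sketch; the choice of k = 3 neighbors and inverse-distance weighting is a common convention and an assumption here, since the text does not fix these details.

```python
# A minimal sketch of KNN interpolation for propagating features from a
# sparse (downsampled) point cloud back onto a denser one.
import numpy as np


def knn_interpolate(dense_xyz, sparse_xyz, sparse_feats, k=3, eps=1e-8):
    """dense_xyz: (N, 3), sparse_xyz: (S, 3), sparse_feats: (S, C).
    Returns (N, C) features interpolated onto the dense points."""
    # Pairwise distances from every dense point to every sparse point.
    d = np.linalg.norm(dense_xyz[:, None, :] - sparse_xyz[None, :, :], axis=2)
    nn_idx = np.argsort(d, axis=1)[:, :k]                   # (N, k) neighbors
    nn_d = np.take_along_axis(d, nn_idx, axis=1)            # (N, k) distances
    w = 1.0 / (nn_d + eps)                                  # inverse-distance
    w = w / w.sum(axis=1, keepdims=True)                    # normalized weights
    return (sparse_feats[nn_idx] * w[..., None]).sum(axis=1)


up = knn_interpolate(np.random.rand(1024, 3), np.random.rand(256, 3),
                     np.random.rand(256, 64))
print(up.shape)  # (1024, 64)
```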
The decoder of the point cloud classification sub-network comprises a global max pooling and a second fully connected layer. Global max pooling is applied to the output of the third residual direction-induced convolution module in the encoder to obtain a global feature of the entire point cloud. Finally, the second fully connected layer predicts the label from the global feature to obtain the classification result.
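A compact sketch of this classification head, with assumed layer sizes, is shown below.

```python
# A minimal sketch of the classification head: global max pooling over all
# points followed by fully connected layers.
import torch
import torch.nn as nn


class ClsHead(nn.Module):
    def __init__(self, c_in=512, n_classes=40):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c_in, 256), nn.ReLU(),
                                nn.Linear(256, n_classes))

    def forward(self, feats):
        """feats: (B, C, N) encoder output; returns (B, n_classes) logits."""
        g = feats.max(dim=2).values  # global max pooling over the point axis
        return self.fc(g)


logits = ClsHead()(torch.randn(2, 512, 256))
print(logits.shape)  # torch.Size([2, 40])
```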
Step 4: feed the point cloud data into the direction-induced convolutional network to obtain point cloud segmentation and classification results.
The direction-induced convolutional network is trained on point cloud segmentation and classification datasets using the adaptive moment estimation (Adam) optimizer with a learning rate of 0.001 and a decay rate of 0.7, and the network parameters are optimized with a cross-entropy loss. All multilayer perceptrons in the network consist of a 1x1 convolution, batch normalization (BN), and a nonlinear activation (ReLU) in series. In the test phase, point cloud data are fed into the trained direction-induced convolutional network to obtain point cloud segmentation and classification results.
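The stated training setup could be wired up roughly as follows; the step interval of the learning-rate decay and the stand-in model are assumptions of this sketch.

```python
# A minimal training-step sketch: Adam with initial learning rate 0.001,
# step decay by a factor of 0.7, and cross-entropy loss.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 40))  # stand-in
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.7)
criterion = nn.CrossEntropyLoss()

for epoch in range(2):                      # toy loop with random data
    x, y = torch.randn(16, 3), torch.randint(0, 40, (16,))
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                        # applies the 0.7 decay per interval
```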
In a further embodiment, the spatial direction set is generated as follows:
Depending on the point cloud dataset, S groups of point clouds {Ps}, s = 1, ..., S are sampled from the training point cloud data. For each sampled point cloud $P_s = \{p_{s,i} \in \mathbb{R}^3 \mid i = 1, \dots, N\}$, the farthest point sampling and neighborhood construction processes of each residual direction-induced convolution module and each farthest-point-sampling residual direction-induced convolution module in the encoding stage of the direction-induced convolutional network are simulated, and the spatial direction features of all sampled points at each layer are computed. All spatial direction features of the corresponding layers across the S groups of point clouds are merged, and the spatial direction features of each layer are clustered with the K-means algorithm to generate the spatial direction set $D_l$ for each layer of the direction-induced convolutional network.
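The direction-set construction might be sketched as follows, using scikit-learn's KMeans for illustration; pooling all per-layer unit directions into one array and re-normalizing the cluster centers are assumptions of this sketch.

```python
# A minimal sketch of generating one layer's spatial direction set D_l by
# clustering pooled unit direction vectors with K-means (M = 15 items).
import numpy as np
from sklearn.cluster import KMeans


def build_direction_set(direction_feats: np.ndarray, m: int = 15) -> np.ndarray:
    """direction_feats: (T, 3) unit direction vectors pooled from the S
    sampled point clouds at one layer; returns the (m, 3) direction set D_l."""
    centers = KMeans(n_clusters=m, n_init=10).fit(direction_feats).cluster_centers_
    # Re-normalize cluster centers so each item is again a direction in 3D space.
    return centers / np.linalg.norm(centers, axis=1, keepdims=True)


# Toy usage: pooled directions from all sampled clouds at one layer.
feats = np.random.randn(100_000, 3)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
D_l = build_direction_set(feats, m=15)
print(D_l.shape)  # (15, 3)
```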
The present invention uses a new direction-induced convolution to improve the performance of point cloud analysis. For dense point clouds, most points in a local neighborhood have similar features, such as spatial position and color information, so it is difficult to use these features to sparsely quantize the neighborhood points. Because the local neighborhood points of a point cloud lie in different directions from the center point, their spatial directions carry a large amount of useful information. The invention therefore proposes to sparsely quantize the local neighborhood points by their spatial direction features, using direction-induced convolution to project the features of the unordered neighborhood points into a canonical, ordered direction set space and extract features there. On top of the direction-induced convolution, a residual direction-induced convolution module and a farthest-point-sampling residual direction-induced convolution module are built to obtain a richer point cloud representation. By alternately stacking these two modules to extract hierarchical point cloud features, a direction-induced convolutional network with the widely used skip-connected encoder-decoder architecture is constructed to accomplish point cloud classification and segmentation tasks.
Example
To verify the effectiveness of the scheme of the present invention, the following experiments were carried out.
To verify the effectiveness of the direction-induced convolution method, the performance of the direction-induced convolutional network is evaluated on a range of tasks (point cloud classification, part segmentation, and large-scale scene segmentation). Different network configurations are used for different tasks: for point cloud classification, 3 residual direction-induced convolution modules and 2 farthest-point-sampling residual direction-induced convolution modules are stacked alternately; for shape part segmentation, 4 residual direction-induced convolution modules and 3 farthest-point-sampling residual direction-induced convolution modules; for semantic segmentation, 5 residual direction-induced convolution modules and 4 farthest-point-sampling residual direction-induced convolution modules. In all classification and segmentation tasks, the adaptive moment estimation (Adam) optimizer is used with a learning rate of 0.001 and a decay rate of 0.7, and the network parameters are optimized with a cross-entropy loss. All multilayer perceptrons in the network consist of a 1x1 convolution, batch normalization (BN), and a nonlinear activation (ReLU) in series. The number of items in the direction set is set to 15 (i.e., M = 15). All experiments are performed on a single Titan Xp GPU.
Table 1: Comparison of shape classification performance on the ModelNet40 dataset
For the point cloud classification task, the invention is evaluated on the ModelNet40 dataset. ModelNet40 comprises 40 object categories, with 9843 models used for training and 2468 for testing. The point cloud data of ModelNet40 are taken as input, with 1024 points uniformly sampled from each point cloud. The experiments use only the geometric coordinates (x, y, z) of the sampled points, normalized so that each 3D object fits in the unit sphere. Simple point cloud data augmentation techniques are applied, including random scaling, translation, and perturbation of the original point cloud. Finally, each test instance is evaluated 10 times, the average of the 10 predictions is taken as the final prediction, and the overall accuracy (OA) over the 40 categories is used to assess the performance of the different methods. The invention is compared with several state-of-the-art methods (divided into two groups according to the number of input points and whether additional features are used; xyz denotes 3D coordinates, nor denotes normal vectors); the results are shown in Table 1.
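The 10-vote evaluation protocol can be sketched as follows; the augmentation function and the stand-in classifier are placeholders, not the patent's actual model.

```python
# A minimal sketch of 10-vote test-time evaluation with overall accuracy (OA).
import torch


def vote_predict(model, points, augment, n_votes=10):
    """points: (B, N, 3); returns (B,) predicted class indices."""
    with torch.no_grad():
        # Average the logits over n_votes randomly augmented passes.
        logits = torch.stack([model(augment(points)) for _ in range(n_votes)])
        return logits.mean(dim=0).argmax(dim=1)


def overall_accuracy(pred: torch.Tensor, target: torch.Tensor) -> float:
    return (pred == target).float().mean().item()


# Toy usage with a fixed random linear classifier and jitter augmentation.
W = torch.randn(3, 40)
model = lambda x: x.mean(dim=1) @ W
pred = vote_predict(model, torch.randn(8, 1024, 3),
                    lambda p: p + 0.01 * torch.randn_like(p))
print(overall_accuracy(pred, torch.randint(0, 40, (8,))))
```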
The invention achieves an OA of 91.5% on the point cloud classification task, a notable result. Compared with other methods that take only the 3D coordinates of the point cloud as input, the proposed direction-induced convolutional network achieves better results, and it remains competitive with methods that use more input information. The experimental results show that direction-induced convolution is capable of modeling the entire point cloud.
Table 2: Comparison of shape part segmentation performance on the ShapeNet part dataset
Part segmentation is a challenging task in shape analysis. The effectiveness of the proposed scheme is therefore evaluated on the ShapeNet part dataset, which contains 16881 3D objects covering 16 shape categories. Each point cloud instance is annotated with 2 to 6 part labels, for a total of 50 part labels. 2048 points are randomly selected as input, and the one-hot encoding of the object label is concatenated in the last feature layer. The experiments use the class-average intersection-over-union (mcIoU) and the instance-average intersection-over-union (mIoU) of the part segmentation results as evaluation metrics.
As shown in Table 2, the invention outperforms the state-of-the-art methods on the shape part segmentation task, reaching 82.9% mcIoU and 85.7% mIoU. All experimental results demonstrate that the invention achieves superior performance, which also confirms the effectiveness of the proposed direction-induced convolution for part modeling tasks.
The invention is further evaluated on the large-scale indoor dataset S3DIS, which contains 273 million points from 6 large indoor areas across 3 different buildings, each point annotated with a semantic label from 13 categories. The Area-5 scene is used for testing and the other scenes for training. In preparing the training and test datasets, each room is divided into blocks of size 1 m x 1 m with a stride of 0.5 m; the dataset spanning the six areas contains 23585 blocks in total. In each block, 4096 points are sampled for training, and each point is represented by a 9-dimensional vector (XYZ, RGB, and normalized XYZ). The experiments use the mean intersection-over-union (mIoU) over the 13 classes, the mean class accuracy (mAcc), and the overall accuracy (OA) as evaluation metrics.
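The three reported metrics can be computed from a per-point confusion matrix, as in the following sketch.

```python
# A minimal sketch of OA, mAcc, and mIoU over the 13 S3DIS classes.
import numpy as np


def segmentation_metrics(pred, target, n_classes=13):
    """pred, target: 1-D integer arrays of per-point labels."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (target, pred), 1)                        # confusion matrix
    tp = np.diag(cm).astype(np.float64)
    oa = tp.sum() / cm.sum()                                # overall accuracy
    acc_per_class = tp / np.maximum(cm.sum(axis=1), 1)      # recall per class
    iou_per_class = tp / np.maximum(cm.sum(axis=1) + cm.sum(axis=0) - tp, 1)
    return oa, acc_per_class.mean(), iou_per_class.mean()


oa, macc, miou = segmentation_metrics(np.random.randint(0, 13, 4096),
                                      np.random.randint(0, 13, 4096))
print(f"OA={oa:.3f} mAcc={macc:.3f} mIoU={miou:.3f}")
```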
Table 3: Comparison of semantic segmentation performance on the S3DIS Area-5 dataset
The quantitative results are shown in Table 3; the invention achieves the best results on all three metrics. The mIoU score of the model exceeds that of PointNet by 21.4% and that of PointCNN by 5.3%. Compared with the previously best-performing point cloud semantic segmentation method, it improves OA, mAcc, and mIoU by 1.1%, 2.2%, and 2.2%, respectively. The experiments show that direction-induced convolution models the semantics of point clouds better. In addition, the model achieves state-of-the-art performance on the wall, window, door, and bookcase categories. Windows, doors, and bookcases embedded in walls are usually hard to distinguish, yet the direction-induced convolutional network still segments them accurately and obtains better results. This indicates that direction-induced convolution can capture subtle differences in local regions and accurately segment complex scenes, which further validates its ability to model the semantics of complex scenes.
The above embodiments are only preferred implementations of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and equivalent substitutions without departing from the principles of the present invention, and technical solutions obtained by such improvements and equivalent substitutions of the claims of the present invention all fall within its protection scope.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011436923.3A CN112488117B (en) | 2020-12-11 | 2020-12-11 | Point cloud analysis method based on direction-induced convolution |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011436923.3A CN112488117B (en) | 2020-12-11 | 2020-12-11 | Point cloud analysis method based on direction-induced convolution |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112488117A CN112488117A (en) | 2021-03-12 |
CN112488117B true CN112488117B (en) | 2022-09-27 |
Family
ID=74940993
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011436923.3A Active CN112488117B (en) | 2020-12-11 | 2020-12-11 | Point cloud analysis method based on direction-induced convolution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112488117B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114638985A (en) * | 2022-03-03 | 2022-06-17 | 北京中关村智连安全科学研究院有限公司 | Electric power tower point cloud classification segmentation model construction method based on core point convolution |
CN114926647B (en) * | 2022-05-20 | 2024-06-07 | 上海人工智能创新中心 | Point cloud identification method, device, equipment and computer readable storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110120097A (en) * | 2019-05-14 | 2019-08-13 | 南京林业大学 | Airborne cloud Semantic Modeling Method of large scene |
CN111862101A (en) * | 2020-07-15 | 2020-10-30 | 西安交通大学 | A Semantic Segmentation Method of 3D Point Clouds from the Coding Perspective of Bird's Eye View |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2404784B (en) * | 2001-03-23 | 2005-06-22 | Thermo Finnigan Llc | Mass spectrometry method and apparatus |
Worldwide applications
- 2020-12-11: application CN202011436923.3A filed in China; granted as patent CN112488117B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN112488117A (en) | 2021-03-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |