CN115880690B - Method for quickly labeling objects in a point cloud with the aid of three-dimensional reconstruction
Description
Technical Field

The present invention relates to the field of artificial intelligence, and in particular to a method for quickly labeling objects in a point cloud with the aid of three-dimensional reconstruction.
Background

Supervised learning remains one of the mainstream approaches in deep learning. It is extremely data-hungry, requiring large numbers of labeled samples for training and evaluation. For example, the popular object detectors YOLOv4 and YOLOX are trained on the roughly 40 GB MS COCO dataset to detect objects in RGB images. 3D object detectors such as VoteNet and Group-Free are trained on the SUN RGB-D dataset to gain the ability to detect 3D objects in point clouds. Similarly, the large nuScenes dataset supports research on intelligent algorithms for autonomous driving.

Beyond public datasets, applying these algorithms in specific industries also requires preparing task-specific datasets. Crowdsourcing is currently a popular approach: many people are organized to perform manual labeling in parallel, and all labeled samples together form the final dataset. This work, however, is time-consuming and expensive.

To address this problem, improving labeling efficiency is crucial. One class of methods exploits unsupervised or weakly supervised learning to accelerate object annotation: features are extracted from RGB samples by an unsupervised method, a small subset of samples is manually annotated based on these features, and the labels are then propagated to the remaining unlabeled RGB samples. In addition, to reduce the difficulty of labeling objects in point clouds, one existing approach labels point clouds indirectly by annotating objects in the aligned RGB images: only 2D boxes are drawn on the RGB image, and an automatic annotator then estimates a high-quality 3D box from the sub-point cloud derived from each 2D box, thereby automatically labeling objects in point clouds used for autonomous driving.

Although the above existing methods have their own advantages, RGB-D samples of typical indoor scenes are collected densely, the fields of view of adjacent collection positions overlap significantly, and the same object in the environment appears repeatedly in the fields of view of multiple RGB-D samples. In this case, traditional object labeling has to label the same object multiple times across multiple RGB-D samples, which greatly increases unnecessary manual labeling cost.
Summary of the Invention

In view of this, to solve the above technical problems, the present invention provides a method for quickly labeling objects in a point cloud with the aid of three-dimensional reconstruction.

The present invention adopts the following technical solution:

A method for quickly labeling objects in a point cloud with the aid of three-dimensional reconstruction comprises the following steps:

acquiring the RGB-D sample sequence to be labeled;

constructing an SE(3) pose graph from the sequence of RGB-D sample collection positions, building spatial constraints between collection positions from the matching relationships between RGB-D samples, and optimizing the SE(3) pose graph to obtain the optimized sequence of RGB-D collection positions; based on the optimized collection position sequence, fusing the RGB-D sample sequence into a global 3D map;

labeling objects manually in the global 3D map;

according to the labeling results, converting each object's pose in the world coordinate system into its pose in the sensor coordinates of the corresponding collection position, thereby obtaining the pose of each object at every collection position, i.e., in every RGB-D sample; together with the object category and size, these poses constitute the object labels under the RGB-D samples.
Further, after the RGB-D sample sequence to be labeled is acquired, the method for quickly labeling objects in the point cloud further comprises: realizing three-dimensional reconstruction of the environment by constructing and optimizing the SE(3) pose graph.

Further, constructing the SE(3) pose graph from the sequence of RGB-D sample collection positions comprises:
The SE(3) pose graph is defined as

$$\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$$

where $\mathcal{V}$ and $\mathcal{E}$ are abbreviations for Vertex and Edge, respectively. $\mathcal{V}$ is the set of variables to be optimized in the SE(3) pose graph, namely the collection positions of all RGB-D samples; $\mathcal{E}$ is the constraint set of the optimization problem, and each constraint $e_{*\hat{*}} \in \mathcal{E}$ represents a spatial constraint between two vertices $s_*$ and $s_{\hat{*}}$. Let the RGB-D sample sequence be $Q=\{f_i \mid i=1,\dots,n\}$, where each $f_i$ is an RGB-D sample comprising an RGB color image $I_i$ and a 3D point cloud $D_i$ equivalent to the depth image, $f_i=(I_i,D_i)$; the collection position sequence is $P=\{s_i \mid i=0,\dots,n\}$, where $s_i$ is a 6-DoF position and orientation in 3D space, represented as an SE(3) variable.
Further, building the spatial constraints between collection positions from the matching relationships between RGB-D samples, and optimizing the SE(3) pose graph to obtain the optimized sequence of RGB-D collection positions, comprises:
Determine the spatial constraint between two collection positions by point cloud registration:

$$e_{*\hat{*}} = \arg\min_{e} \sum_{\langle p,q \rangle} \left\lVert p - e \cdot q \right\rVert^{2}, \qquad e^{0} = (s'_{*})^{-1} \cdot s'_{\hat{*}}$$

where $\langle p,q \rangle$ denotes a point pair between $D_*$ and $D_{\hat{*}}$, $s'_{*}$ and $s'_{\hat{*}}$ are the initial values of $s_*$ and $s_{\hat{*}}$, and $e^{0} = (s'_{*})^{-1} \cdot s'_{\hat{*}}$ serves as the initial value of the variable $e$.
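As an illustrative sketch, this registration step can be realized with point-to-point ICP in Open3D; the function below and its parameter choices (such as the correspondence distance) are assumptions for demonstration, not part of the claimed method:

```python
import numpy as np
import open3d as o3d

def estimate_constraint(D_star, D_hat, s_star_init, s_hat_init, max_dist=0.05):
    """Estimate the spatial constraint e between two collection positions by
    registering point clouds D_* and D_hat (o3d.geometry.PointCloud).
    s_star_init, s_hat_init: 4x4 initial pose matrices s'_* and s'_hat."""
    e0 = np.linalg.inv(s_star_init) @ s_hat_init   # initial value e^0
    result = o3d.pipelines.registration.registration_icp(
        D_hat, D_star, max_dist, e0,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation                   # refined constraint e
```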
Predict the offset between collection positions $s_*$ and $s_{\hat{*}}$ by

$$\hat{e}_{*\hat{*}}(X) = s_{*}^{-1} \cdot s_{\hat{*}}$$
From the above spatial constraints and predicted offsets, the optimization objective of the SE(3) pose graph is

$$X^{*} = \arg\min_{X} \sum_{e_{*\hat{*}} \in \mathcal{E}} \big( e_{*\hat{*}} \ominus \hat{e}_{*\hat{*}}(X) \big)^{\top} \, \Omega_{*\hat{*}} \, \big( e_{*\hat{*}} \ominus \hat{e}_{*\hat{*}}(X) \big)$$

where $X=\{s_1, s_2, \dots, s_n\}$ is the vector of parameters to be optimized and each $\Omega_{*\hat{*}}$ is the information matrix of an error term $e_{*\hat{*}} \ominus \hat{e}_{*\hat{*}}(X)$. The g2o library is used to solve this optimization problem and obtain the optimized collection positions $X^{*}=\{s_{i}^{*} \mid i=1,\dots,n\}$.
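A minimal sketch of this pose-graph optimization is given below, assuming the g2opy Python bindings of g2o, identity information matrices, and 4x4 matrix inputs; it illustrates the vertex/edge structure described above rather than the claimed implementation:

```python
import numpy as np
import g2o

def optimize_pose_graph(initial_poses, constraints, iterations=50):
    """initial_poses: list of 4x4 matrices s'_i; constraints: list of
    (i, j, e_ij) with e_ij a 4x4 relative transform from registration."""
    opt = g2o.SparseOptimizer()
    solver = g2o.BlockSolverSE3(g2o.LinearSolverEigenSE3())
    opt.set_algorithm(g2o.OptimizationAlgorithmLevenberg(solver))

    for i, pose in enumerate(initial_poses):        # vertices: collection poses s_i
        v = g2o.VertexSE3()
        v.set_id(i)
        v.set_estimate(g2o.Isometry3d(pose))
        v.set_fixed(i == 0)                         # anchor the first pose
        opt.add_vertex(v)

    for i, j, e_ij in constraints:                  # edges: spatial constraints
        edge = g2o.EdgeSE3()
        edge.set_vertex(0, opt.vertex(i))
        edge.set_vertex(1, opt.vertex(j))
        edge.set_measurement(g2o.Isometry3d(e_ij))
        edge.set_information(np.identity(6))        # assumed information matrix
        opt.add_edge(edge)

    opt.initialize_optimization()
    opt.optimize(iterations)
    return [opt.vertex(i).estimate().matrix() for i in range(len(initial_poses))]
```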
Further, fusing the RGB-D sample sequence into a global 3D map based on the optimized collection position sequence comprises:

transforming all RGB-D samples into the world coordinate system,

$$D_{i}^{w} = s_{i}^{*} \cdot D_{i}$$

so that all RGB-D samples are represented in the same world coordinate system and together constitute the 3D map $M$ of the environment:

$$M = \bigcup_{i=1}^{n} D_{i}^{w}$$
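In practice, this fusion amounts to transforming each cloud by its optimized pose and concatenating the results; a sketch with Open3D follows, where the voxel size used to thin duplicated points in overlapping regions is an assumed choice:

```python
import open3d as o3d

def fuse_map(clouds, optimized_poses, voxel=0.02):
    """clouds: list of o3d.geometry.PointCloud D_i; optimized_poses: list of
    4x4 matrices s_i^*. Returns the global 3D map M."""
    M = o3d.geometry.PointCloud()
    for D_i, s_i in zip(clouds, optimized_poses):
        D_w = o3d.geometry.PointCloud(D_i)    # copy, keep the raw sample intact
        D_w.transform(s_i)                    # D_i^w = s_i^* . D_i
        M += D_w
    return M.voxel_down_sample(voxel)         # thin duplicate points in overlaps
```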
Further, labeling objects manually in the global 3D map comprises:

loading the 3D map $M$ into object-labeling software and labeling objects manually. For any object $o_j$, the object shape is represented by a 3D bounding box (3D-BBox), and the object label has the format

$$L_{j}^{w} = (c_j, S_j, \xi_{j}^{w})$$

where $c_j$ is the object category, $S_j=[w_j, l_j, h_j]$ is the size of the 3D-BBox, and $\xi_{j}^{w}=[x_j, y_j, z_j, \alpha_j, \beta_j, \gamma_j]$ is the pose of the object in the world coordinate system.
Further, converting each object's pose in the world coordinate system into its pose in the sensor coordinates of the corresponding collection position according to the labeling results, i.e., obtaining the pose of each object in each sample and hence the object labels corresponding to the RGB-D samples, comprises:
expressing the pose of object $o_j$ in the world coordinate system in the following matrix form:

$$T_{j}^{w} = \begin{bmatrix} R_{j}^{w} & t_{j}^{w} \\ 0 & 1 \end{bmatrix}$$

where $t_{j}^{w}=[x_j, y_j, z_j]^{\top}$ is the position of the object in the world coordinate system and $R_{j}^{w}$ is the three-dimensional rotation matrix equivalent to the Euler angles $(\alpha_j, \beta_j, \gamma_j)$;
then, based on the optimized collection position $s_{i}^{*}$ of each RGB-D sample $f_i$, transforming the pose of object $o_j$ from the global coordinate system into the sensor coordinates at collection time:

$$T_{j}^{i,0} = (s_{i}^{*})^{-1} \cdot T_{j}^{w}$$
The resulting pose rotation matrix $R_{j}^{i,0}$ contains small rotations around the x and y axes, so it is simplified through two conversions between rotation matrices and Euler angles:

$$(\alpha, \beta, \gamma) = \mathrm{rot2Euler}(R_{j}^{i,0}), \qquad R_{j}^{i} = \mathrm{Euler2rot}(0, 0, \gamma)$$

Here, rot2Euler obtains the Euler angles equivalent to a rotation matrix, and Euler2rot obtains the rotation matrix equivalent to a set of Euler angles.
Finally, the pose matrix of object $o_j$ in RGB-D sample $f_i$ is obtained as

$$T_{j}^{i} = \begin{bmatrix} R_{j}^{i} & t_{j}^{i} \\ 0 & 1 \end{bmatrix}$$

where $t_{j}^{i}$ is the translation part of $T_{j}^{i,0}$.
Through the above procedure, the parsed object label is obtained automatically as

$$L_{j}^{i} = (c_j, S_j, \xi_{j}^{i})$$

where $\xi_{j}^{i}$ is the pose of object $o_j$ in the RGB-D sample in vector form.
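A sketch of this label-parsing step in Python, using scipy's Rotation in place of rot2Euler/Euler2rot (the 'xyz' Euler convention here is an assumption; the actual convention must match the labeling software):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def parse_label(T_j_w, s_i_star):
    """Convert an object pose T_j^w (4x4, world frame) into the sensor frame
    of sample f_i, given its optimized collection pose s_i^* (4x4)."""
    T0 = np.linalg.inv(s_i_star) @ T_j_w                 # T_j^{i,0}
    # rot2Euler: extract Euler angles, then keep only the yaw around z
    _, _, gamma = R.from_matrix(T0[:3, :3]).as_euler('xyz')
    # Euler2rot: rebuild the rotation with the x/y rotations zeroed
    T = np.eye(4)
    T[:3, :3] = R.from_euler('z', gamma).as_matrix()
    T[:3, 3] = T0[:3, 3]
    return T                                             # pose matrix T_j^i
```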
Further, the method for quickly labeling objects in a point cloud with the aid of three-dimensional reconstruction also comprises a label verification step, specifically:
for an RGB-D frame $f_i$, using Open3D point cloud segmentation to obtain the sub-point cloud $D_{j}^{i}$ located inside the label $L_{j}^{i}$;

computing the volume $S_j$ of the minimal convex hull of the sub-point cloud $D_{j}^{i}$;

judging the validity of the label from the volume $V_j$ of the 3D-BBox representing object $o_j$:

$$r = \frac{S_j}{V_j}$$

where $V_j = w_j \cdot l_j \cdot h_j$ is the volume of the object-shape 3D-BBox, and $r$, the ratio of the volume of the sub-point cloud's minimal convex hull to the volume of the 3D-BBox, expresses the confidence that object $o_j$ and its label $L_{j}^{i}$ exist in the current RGB-D sample. Only when $r$ exceeds a set threshold is the label $L_{j}^{i}$ valid and accepted as the label of an object $o_j$ in RGB-D sample $f_i$.
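A sketch of this verification step with Open3D, cropping the sample cloud with the label's oriented box and comparing the convex-hull volume with the box volume; the threshold value used here is an assumed example, as the text does not fix one:

```python
import numpy as np
import open3d as o3d

def verify_label(D_i, center, R_mat, size_wlh, tau=0.1):
    """Check label validity in sample f_i. center (3,), R_mat (3x3) and
    size_wlh = [w, l, h] describe the label's 3D-BBox in the sensor frame;
    tau is an assumed acceptance threshold."""
    bbox = o3d.geometry.OrientedBoundingBox(center, R_mat,
                                            np.asarray(size_wlh, dtype=float))
    sub = D_i.crop(bbox)                      # sub-point cloud D_j^i inside the box
    if len(sub.points) < 4:                   # too few points to form a 3D hull
        return False
    hull, _ = sub.compute_convex_hull()       # minimal convex hull of D_j^i
    S_j = hull.get_volume()                   # hull volume S_j
    V_j = float(np.prod(size_wlh))            # box volume V_j = w * l * h
    return S_j / V_j > tau                    # valid only if r = S_j / V_j > tau
```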
The beneficial effects of the present invention are as follows: an SE(3) pose graph is first constructed from the RGB-D sample sequence, spatial constraints between collection positions are built from the matching relationships between RGB-D samples, and the pose graph is optimized to obtain the optimized sequence of collection positions; based on this sequence, the RGB-D sample sequence is fused into a global 3D map; objects are then labeled manually in the global 3D map; finally, according to the labeling results, each object's pose in the world coordinate system is converted into its pose in the sensor coordinates of each collection position, yielding the object's pose in every RGB-D sample as the object label corresponding to that sample. In the present invention, each object that appears repeatedly in different RGB-D samples needs to be labeled manually only once in the fused 3D map, after which its labels throughout the RGB-D sample sequence are obtained automatically, greatly reducing the cost of manual labeling.
Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below:

Fig. 1 is a schematic diagram of the overall flow of the method for quickly labeling objects in a point cloud with the aid of 3D reconstruction provided by the present application.
Detailed Description

In the following description, specific details such as particular system structures and technologies are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may also be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present application.

It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or combinations thereof.

It should also be understood that the term "and/or" used in this specification and the appended claims refers to, and includes, any and all possible combinations of one or more of the associated listed items.

As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining" or "in response to detecting". Similarly, the phrases "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".

In addition, in the description of this specification and the appended claims, the terms "first", "second", "third", and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.

References to "one embodiment" or "some embodiments" in this specification mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise. The terms "comprising", "including", "having" and their variants all mean "including but not limited to" unless specifically emphasized otherwise.

To explain the technical solution described in the present application, specific embodiments are described below.
This embodiment provides a method for quickly labeling objects in a point cloud with the aid of three-dimensional reconstruction. By creating and optimizing an SE(3) pose graph (Special Euclidean group for n=3, SE(3)) based on the sequence of RGB-D sample collection positions, the RGB-D sample sequence is fused into a global 3D map. Objects are then labeled manually in the global 3D map, and label parsing automatically converts these labels into object labels in each RGB-D sample, finally achieving fast object labeling. Referring to Fig. 1, the implementation steps of the method provided by this embodiment are described below.

Step S1: acquire the RGB-D sample sequence to be labeled.

As a specific application scenario, a custom-developed robot platform is used to collect data. A Kinect v2 sensor is fixed above the robot at a height of 1.5 m and collects the RGB-D sample sequence to be labeled. Odometry is obtained from the robot kinematics model and the motor driver data, and provides the initial collection position sequence during construction of the SE(3) pose graph. To reduce the side effects of robot shaking on 3D reconstruction, 6-DoF collection poses are used to represent the sensor trajectory.

The coordinate system of the RGB-D sensor is x-rightward, y-downward, and z-forward. Because the sensor is mounted at a tilt, the ground in the point cloud expressed in sensor coordinates is not parallel to the xoy plane, which does not meet the requirements of the labeling software labelCloud. To solve this problem, an x-rightward, y-forward, z-upward coordinate system whose xoy plane is parallel to the ground is defined first, so that the ground in the processed point cloud is parallel to the xoy plane; a sketch of this frame change is given below.
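The sketch below remaps the camera axes to the upright frame and compensates the mounting pitch; the pitch angle and the sign convention are assumptions that would come from the actual mount calibration:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

# Axis remap from camera axes (x-right, y-down, z-forward) to the upright
# frame (x-right, y-forward, z-up): x' = x, y' = z, z' = -y.
AXIS_REMAP = np.array([[1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0],
                       [0.0, -1.0, 0.0]])

def to_upright(points, pitch_deg=20.0):
    """points: (N, 3) array in the sensor frame; pitch_deg is an assumed
    downward mounting tilt, obtained from calibration in practice."""
    tilt = R.from_euler('x', pitch_deg, degrees=True).as_matrix()  # undo tilt
    return points @ (AXIS_REMAP @ tilt).T     # ground becomes parallel to xoy
```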
Step S2: construct the SE(3) pose graph from the sequence of RGB-D sample collection positions, build spatial constraints between collection positions from the matching relationships between RGB-D samples, and optimize the SE(3) pose graph to obtain the optimized sequence of RGB-D collection positions; based on the optimized collection position sequence $X^{*}$, fuse the RGB-D sample sequence into a global 3D map.

As a specific implementation, after the RGB-D sample sequence to be labeled is acquired, the environment is reconstructed in three dimensions, as follows:

Let the RGB-D sample sequence be $Q=\{f_i \mid i=1,\dots,n\}$, where each $f_i$ is an RGB-D sample comprising an RGB color image $I_i$ and a 3D point cloud $D_i$ equivalent to the depth image, $f_i=(I_i,D_i)$. One goal of 3D reconstruction is then to optimize the collection position sequence $P=\{s_i \mid i=0,\dots,n\}$, where $s_i$ is a 6-DoF position and orientation in 3D space, represented as an SE(3) variable.

To achieve this goal, the SE(3) pose graph is constructed from the RGB-D sample sequence by the following procedure:
The SE(3) pose graph is

$$\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$$

where $\mathcal{V}$ and $\mathcal{E}$ denote the vertices (Vertex) and edges (Edge), respectively; $\mathcal{V}$ is the set of variables to be optimized in the SE(3) pose graph, i.e., the collection positions of all RGB-D samples; $\mathcal{E}$ is the constraint set of the optimization problem, and each constraint $e_{*\hat{*}} \in \mathcal{E}$ represents a spatial constraint between two vertices $s_*$ and $s_{\hat{*}}$.
This embodiment assumes that the sequence of collection positions is continuous and that the distance between adjacent collection positions is small, so the overlap between adjacent RGB-D samples occupies a considerable proportion of the RGB-D sensor's field of view. Based on this assumption, a specific implementation of optimizing the SE(3) pose graph under the spatial constraints between RGB-D samples, to obtain the optimized sequence of all collection positions, is given as follows:

Determine the spatial constraints between collection poses by point cloud registration:

$$e_{*\hat{*}} = \arg\min_{e} \sum_{\langle p,q \rangle} \left\lVert p - e \cdot q \right\rVert^{2}, \qquad e^{0} = (s'_{*})^{-1} \cdot s'_{\hat{*}}$$

where $\langle p,q \rangle$ denotes a point pair between $D_*$ and $D_{\hat{*}}$, $s'_{*}$ and $s'_{\hat{*}}$ are the initial values of $s_*$ and $s_{\hat{*}}$, and $e^{0}=(s'_{*})^{-1} \cdot s'_{\hat{*}}$ is the initial value of the variable $e$ in the optimization. Depending on the relationship between $*$ and $\hat{*}$, the constraint falls into two cases: when $\hat{*}=*+1$, it is a spatial constraint between two consecutive collection positions; otherwise, when $\hat{*}>*+1$, it is a spatial constraint between two widely separated collection positions, established by detecting a closed loop in the sensor trajectory.
Predict the offset between collection positions $s_*$ and $s_{\hat{*}}$ by

$$\hat{e}_{*\hat{*}}(X) = s_{*}^{-1} \cdot s_{\hat{*}}$$

From the above spatial constraints and predicted offsets, the optimization objective of the SE(3) pose graph is

$$X^{*} = \arg\min_{X} \sum_{e_{*\hat{*}} \in \mathcal{E}} \big( e_{*\hat{*}} \ominus \hat{e}_{*\hat{*}}(X) \big)^{\top} \, \Omega_{*\hat{*}} \, \big( e_{*\hat{*}} \ominus \hat{e}_{*\hat{*}}(X) \big)$$

where $X=\{s_1, s_2, \dots, s_n\}$ is the vector of parameters to be optimized and each $\Omega_{*\hat{*}}$ is the information matrix of an error term $e_{*\hat{*}} \ominus \hat{e}_{*\hat{*}}(X)$. The g2o library is used to solve this optimization problem and obtain the optimized collection position sequence $X^{*}=\{s_{i}^{*} \mid i=1,\dots,n\}$.
A specific implementation of fusing the RGB-D sample sequence into a global 3D map based on the obtained collection position sequence is given below:

Based on the optimized collection position sequence, transform all RGB-D samples into the world coordinate system:

$$D_{i}^{w} = s_{i}^{*} \cdot D_{i}$$

All RGB-D samples are then represented in the same world coordinate system and together constitute the 3D map $M$ of the environment:

$$M = \bigcup_{i=1}^{n} D_{i}^{w}$$

Through the above operations, the global 3D map is obtained.
Step S3: label objects manually in the global 3D map.

After the global 3D map is obtained, objects are labeled in it manually, and the types of labeled objects can be chosen flexibly according to the needs of the specific task. In this embodiment, the labelCloud software is used for manual object labeling in the 3D map $M$. labelCloud supports labeling objects in a point cloud in the form of 3D bounding boxes (3D-BBox). For any object $o_j$, the label format is

$$L_{j}^{w} = (c_j, S_j, \xi_{j}^{w})$$

where $c_j$ is the object category, $S_j=[w_j, l_j, h_j]$ is the size of the 3D bounding box (3D-BBox), and $\xi_{j}^{w}=[x_j, y_j, z_j, \alpha_j, \beta_j, \gamma_j]$ is the pose of the object in the world coordinate system.

This embodiment assumes that objects stand perpendicular to the horizontal xoy plane, so the rotation angles around the x and y axes are 0. Of the three quantities in the label, the category $c_j$ and size $S_j$ are basic attributes of the object and do not change with position; in contrast, the pose $\xi_{j}^{w}$ depends on the coordinate system in which the object is expressed.
Step S4: according to the labeling results, convert each object's pose in the world coordinate system into its pose in the sensor coordinates of the corresponding collection position, obtaining the pose of each object at every collection position, i.e., in every RGB-D sample; together with the object category and size, these constitute the object labels under the RGB-D samples.

In the labeling results, the category and size do not change with the object's position; they are basic attributes of the object. The object pose, however, depends on the coordinate system in which the object is expressed: the pose represented in the world coordinate system differs from that in a sensor coordinate system. To label the RGB-D sample sequence, this embodiment converts each object's pose in the world coordinate system into its pose in the sensor coordinates of each collection position from which the object is observed. For each object $o_j$, the pose is expressed in the following matrix form:

$$T_{j}^{w} = \begin{bmatrix} R_{j}^{w} & t_{j}^{w} \\ 0 & 1 \end{bmatrix}$$

where $t_{j}^{w}=[x_j, y_j, z_j]^{\top}$ is the position of the object in the world coordinate system and $R_{j}^{w}$ is the rotation matrix in three-dimensional space obtained from the Euler angles $(\alpha_j, \beta_j, \gamma_j)$.
Based on the optimized collection position $s_{i}^{*}$ of each RGB-D sample $f_i$, transform the pose of object $o_j$ from the global coordinate system into the sensor coordinates at collection time:

$$T_{j}^{i,0} = (s_{i}^{*})^{-1} \cdot T_{j}^{w}$$

In this embodiment, the superscript 0 indicates that this is a preliminary pose. After the pose-graph optimization problem above is solved, the optimized rotation matrices contain small rotations around the x and y axes; propagated through the transformation above, the pose rotation matrix $R_{j}^{i,0}$ therefore also contains small rotations around the x and y axes, and is simplified through two conversions between rotation matrices and Euler angles:

$$(\alpha, \beta, \gamma) = \mathrm{rot2Euler}(R_{j}^{i,0}), \qquad R_{j}^{i} = \mathrm{Euler2rot}(0, 0, \gamma)$$

Here, rot2Euler obtains the Euler angles equivalent to a rotation matrix, and Euler2rot obtains the rotation matrix equivalent to a set of Euler angles.
The label corresponding to object $o_j$ in RGB-D sample $f_i$ is thus obtained, i.e., the pose and its label in the required label format become the object label corresponding to the RGB-D sample:

$$L_{j}^{i} = (c_j, S_j, \xi_{j}^{i})$$

where $\xi_{j}^{i}$ is the pose of object $o_j$ in the RGB-D sample in vector form.
In addition, the method for quickly labeling objects in a point cloud with the aid of three-dimensional reconstruction provided by this embodiment further comprises the following label verification step:

The label $L_{j}^{i}$ above only specifies the spatial pose of the label in sample $f_i$; this embodiment uses the point cloud $D_i$ to check the label's validity. First, Open3D segmentation is used to obtain the sub-point cloud $D_{j}^{i}$ located inside the 3D-BBox of label $L_{j}^{i}$. Then the volume $S_j$ of the sub-point cloud's minimal convex hull is computed, and finally the validity of the label is judged from the volume $V_j$ of the 3D-BBox:

$$r = \frac{S_j}{V_j}$$

where $V_j = w_j \cdot l_j \cdot h_j$ is the volume of the object-shape 3D-BBox, and $r$, the ratio of the volume of the sub-point cloud's minimal convex hull to the volume of the 3D-BBox, expresses the confidence that object $o_j$ and its label $L_{j}^{i}$ exist in the current RGB-D sample. Only when $r$ exceeds the set threshold is the label $L_{j}^{i}$ valid and accepted as the label of an object $o_j$ in RGB-D sample $f_i$.
The above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.