CN110751684B - Object three-dimensional reconstruction method based on depth camera module
- Publication number
- CN110751684B (application CN201910994807.4A)
- Authority
- CN
- China
- Prior art keywords
- voxel
- new
- hash
- hash table
- reconstruction
- Prior art date
- 2019-10-18
- Legal status
- Active
Classifications
- G06T7/50: Image analysis; depth or shape recovery
- G06T15/005: 3D [Three Dimensional] image rendering; general purpose rendering architectures
- G06T17/20: Three dimensional [3D] modelling; finite element generation, e.g. wire-frame surface description, tessellation
- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to the field of three-dimensional reconstruction within computer vision, and in particular to a method for three-dimensional reconstruction of objects based on a depth camera module. The invention adopts a new hashing method, MD5, within the voxel hashing pipeline, which substantially speeds up data insertion, lookup, and indexing while reducing collisions. In addition, the invention proposes a new memory allocation scheme that overcomes the limitation of existing methods, which allocate a fixed amount of memory once up front; memory is instead allocated automatically during reconstruction, so the reconstructed region can be extended dynamically. The invention also adopts a new method for aligning consecutive frames: the ORB feature points of two consecutive color frames are computed and well-matched corresponding point pairs are selected; using these correspondences, the world coordinates of the previous frame and the camera coordinates of the following frame are obtained from the depth maps, from which the camera extrinsics of the following frame are solved and the two frames aligned. This approach finds correspondences more accurately and reduces drift in three-dimensional reconstruction.
Description
Technical field
The present invention relates to the field of three-dimensional reconstruction within computer vision and, more specifically, to a method for three-dimensional reconstruction of objects based on a depth camera module.
Background
In recent years, computer vision technology has developed rapidly. Three-dimensional reconstruction with computer vision has become more efficient and convenient, with many applications in cultural-relic protection and restoration, industrial inspection, and 3D printing; it is also an important component of VR and AR applications. Early 3D reconstruction techniques usually took 2D images as input and reconstructed the 3D models in a scene; because of the inherent data limitations of this mode of reconstruction, the resulting models are prone to holes and distortion. The invention of various depth cameras in recent years, such as Kinect, TOF, and RealSense, has greatly advanced 3D reconstruction technology, and more and more techniques favor depth cameras for real-time 3D reconstruction.
Real-time 3D reconstruction uses depth data captured from different viewpoints and converts it into a common coordinate system to reconstruct and render the surface; achieving high reconstruction quality, large reconstruction scale, and fast reconstruction speed at the same time is difficult and challenging. Reconstruction based on the KinectFusion algorithm basically meets the real-time requirement, but it can only handle small scenes and its reconstruction process consumes a great deal of video memory. Reconstruction based on the voxel hashing algorithm is a comparatively strong method: by hashing voxels it balances speed against scene size. It still has the following shortcomings, however: 1. the hash function on which the technique relies, H(x, y, z) = (x·p₁ ⊕ y·p₂ ⊕ z·p₃) mod n, has a relatively high probability of hash collisions, and frequent collisions seriously slow down the algorithm and increase the risk of memory overflow; 2. the technique must allocate a fixed-size block of memory once up front during voxel hashing, which limits the scalability of the reconstruction.
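For concreteness, a minimal sketch of the prior-art spatial hash criticized above. The patent does not list the primes; the values below are the ones commonly cited in the voxel-hashing literature and are an assumption here:

```cpp
#include <cstdint>

// Prior-art spatial hash: H(x, y, z) = (x*p1 XOR y*p2 XOR z*p3) mod n.
// The primes are the commonly cited choices, not values from the patent.
uint32_t spatialHash(int32_t x, int32_t y, int32_t z, uint32_t n) {
    const uint32_t p1 = 73856093u, p2 = 19349663u, p3 = 83492791u;
    uint32_t h = (static_cast<uint32_t>(x) * p1) ^
                 (static_cast<uint32_t>(y) * p2) ^
                 (static_cast<uint32_t>(z) * p3);
    return h % n;  // distinct (x, y, z) triples may land in the same bucket
}
```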
Summary of the invention
To overcome the defect of the prior art identified above, namely that the hash function used in voxel hashing has a relatively high probability of hash collisions, which seriously slows down the algorithm and increases the risk of memory overflow, the present invention provides a method for three-dimensional reconstruction of objects based on a depth camera module that adopts a new hash function method, substantially speeding up data insertion, lookup, and indexing while reducing collisions.
To solve the above technical problem, the present invention adopts the following technical solution: a method for three-dimensional reconstruction of objects based on a depth camera module that performs the reconstruction with a voxel-hashing-based algorithm, characterized in that during voxel hashing the hash value is computed by a new hash function method comprising the following steps:
First, the three-dimensional coordinates are converted into a one-dimensional index by a conversion formula in which Δ is the size of the current device resolution, that is, the size of one voxel;
Then, the computed one-dimensional index data are converted into a hash value by the MD5 method.
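The conversion formula itself is reproduced only as an image in the published text. One plausible reading, a row-major flattening of voxel coordinates quantized by Δ, is sketched below; the grid extents W and H are illustrative assumptions, and the patent's actual formula may differ:

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical 3D-to-1D index: quantize world coordinates by the voxel
// size delta (the Δ above), then flatten row-major over a W x H x D grid.
int64_t toLinearIndex(double x, double y, double z, double delta,
                      int64_t W, int64_t H) {
    int64_t vx = static_cast<int64_t>(std::floor(x / delta));
    int64_t vy = static_cast<int64_t>(std::floor(y / delta));
    int64_t vz = static_cast<int64_t>(std::floor(z / delta));
    return vx + vy * W + vz * W * H;  // this index is then fed to MD5
}
```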
Further, the MD5 conversion process specifically comprises the following steps:
S21. Append a single 1 bit and then 0 bits to the one-dimensional index data until its length in bits is congruent to 448 modulo 512; the length of the data before padding is recorded in the final 64 bits, so the padded data length is an integer multiple of 512. Recording the pre-padding length in the last 64 bits means that those 64 binary digits, converted to a decimal number, give the length of the data before padding;
S22. Process the padded data in 512-bit groups, with each group divided into sixteen 32-bit sub-blocks and four 32-bit registers cycled over the sub-blocks; initialize MD5 with four chaining variables as parameters: a = 0x67452301, b = 0xefcdab89, c = 0x98badcfe, d = 0x10325476;
S23. Perform four rounds of compression, each round consisting of 16 steps and each step consuming one message word, where a message is one 512-bit group of data; update the variables a, b, c, d with the step function Q_{i+1} = Q_i + ((Q_{i-3} + f_i(Q_i, Q_{i-1}, Q_{i-2}) + w_i + t_i) <<< s_i);
S24. Concatenate the four 32-bit registers to obtain a 128-bit value, the final hash value. Each of the parameters a, b, c, d is 32 bits, so the compression result obtained after all messages have been processed is their concatenation of 32 × 4 = 128 bits.
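A minimal sketch of the S23 step function in isolation. In full MD5 the round function f_i, message word w_i, additive constant t_i, and rotation amount s_i all change from step to step, so this shows the shape of one update rather than a complete implementation:

```cpp
#include <cstdint>

// Rotate a 32-bit word left by s bits (the <<< operator in S23).
static uint32_t rotl32(uint32_t v, unsigned s) {
    return (v << s) | (v >> (32u - s));
}

// Round-1 function F(b, c, d) = (b AND c) OR (NOT b AND d), one of the
// four round functions f_i used across the 64 steps.
static uint32_t roundF(uint32_t b, uint32_t c, uint32_t d) {
    return (b & c) | (~b & d);
}

// One step: Q_{i+1} = Q_i + ((Q_{i-3} + f_i(Q_i, Q_{i-1}, Q_{i-2}) + w_i + t_i) <<< s_i)
static uint32_t md5Step(uint32_t qIm3, uint32_t qIm2, uint32_t qIm1, uint32_t qI,
                        uint32_t (*f)(uint32_t, uint32_t, uint32_t),
                        uint32_t w, uint32_t t, unsigned s) {
    return qI + rotl32(qIm3 + f(qI, qIm1, qIm2) + w + t, s);
}
```

Sixteen such steps per round, over four rounds, transform the chaining variables a, b, c, d for every 512-bit group; S24 then concatenates the final register values.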
Further, the present invention proposes a dynamic memory management method. The existing voxel-hashing hash table pre-allocates its memory; although this improves index efficiency, it limits the extent of the scene that can be reconstructed. Borrowing from the hash map of the C++11 standard template library, the present invention implements dynamic expansion of memory: when the pre-allocated memory is full, the hash table memory is expanded dynamically, enabling three-dimensional reconstruction over a larger region. The dynamic memory management method comprises the following steps: add an integer variable to the hash table representing the number of positions currently available; when its value approaches 0, stop inserting into the table, allocate a new hash table of equal size, point the head of the new table at the tail of the old one, and then rehash. If the original table held n buckets, the combined table now holds 2n; the hash value of every previously inserted element is taken modulo 2n to obtain its new bucket number, and the elements of the old table are moved into the expanded table. Once rehashing is complete, insertions into the new table can resume. During lookup, after a hash value is obtained, it is taken modulo n to determine which bucket the entry belongs to; after expansion the number of buckets grows from n to 2n, which is why the hash values of existing elements must be reduced modulo 2n.
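A minimal sketch of this grow-and-rehash scheme, shown host-side with separate chaining; the patent chains a new equally sized table onto the old one before rehashing, while this sketch simply allocates the doubled table directly. The counter that triggers growth and the modulo-2n re-bucketing follow the description above, and all names are illustrative:

```cpp
#include <cstdint>
#include <list>
#include <utility>
#include <vector>

// Grow-and-rehash table: a free-slot counter triggers doubling, and every
// stored key is re-bucketed modulo the new bucket count (2n).
class GrowableHashTable {
public:
    explicit GrowableHashTable(std::size_t n) : buckets_(n), freeSlots_(n) {}

    void insert(std::uint64_t hash, int voxelBlockId) {
        if (freeSlots_ == 0) grow();   // table full: expand before inserting
        buckets_[hash % buckets_.size()].emplace_back(hash, voxelBlockId);
        --freeSlots_;
    }

private:
    using Entry = std::pair<std::uint64_t, int>;

    void grow() {
        std::size_t newSize = buckets_.size() * 2;   // n -> 2n
        std::vector<std::list<Entry>> bigger(newSize);
        for (auto& bucket : buckets_)
            for (auto& kv : bucket)                  // rehash modulo 2n
                bigger[kv.first % newSize].push_back(kv);
        buckets_ = std::move(bigger);
        freeSlots_ += newSize / 2;   // n additional slots became available
    }

    std::vector<std::list<Entry>> buckets_;
    std::size_t freeSlots_;          // the integer counter described above
};
```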
Further, the method for three-dimensional reconstruction of objects based on the depth camera module specifically comprises the following steps:
S1. Establish the world model of the architecture: three-dimensional reconstruction essentially builds a voxel set over a space large enough to enclose the object to be reconstructed. This parent voxel set decomposes into three levels of substructure: the chunk is the first-level substructure, with the voxel set containing n*n*n chunks; the block is the second-level substructure, with each chunk containing m*m*m blocks; the voxel is the third-level substructure, with each block containing t*t*t voxels. The variables m, n, and t are positive integers and can be set according to the specific situation;
S2. Establish a "video stream" from the TOF module; obtain from it the depth map, RGB image, and point cloud at the current time, and read the camera intrinsics K of the TOF module and the camera extrinsics T of the current pose into the corresponding parameters;
S3. From the color images of two consecutive frames, compute their ORB feature points and select well-matched corresponding point pairs; using the correspondences, obtain from the depth maps the world coordinates of the previous frame and the camera coordinates of the current frame, solve for the camera extrinsics of the current frame, and thereby align the point cloud of the current frame into the world coordinate system of the reference frame (the first frame);
S4. Build a scalable dynamic hash table on the GPU device through the MD5-based voxel hashing method; based on the view-frustum principle, move the blocks lying within the frustum from the host into the device-side hash table;
S5. Using the depth map of the current frame together with the information in the hash table, select the blocks that are valid within the truncation range of the depth region; build a new dynamically allocatable hash table on the GPU device through the dynamic memory management method, and move the valid blocks into it. The new hash table records the position information of every valid block and the location of the voxel array it points to;
S6. Traverse all block positions in the new hash table and launch CUDA threads to traverse every voxel in each block. From the initial position of the parent voxel set's origin in the world coordinate system and the true length of each voxel, compute the voxel's position in the world coordinate system; project it from world coordinates to pixel coordinates with the back-projection formula, and determine whether the voxel's world position is actually visible within the depth frame. If it is not visible, do nothing; otherwise update its TSDF value and its current weight with the truncated signed distance field (TSDF) formula (a sketch of this per-voxel update follows this list);
S7. The TSDF values amount to a family of isosurfaces, and the positions where the TSDF is 0 correspond to the object surface; generate a triangular surface mesh with the Marching Cubes technique and render it;
S8. Copy all voxels in the new hash table from the GPU device back to the host, record the voxel contents at those positions on the host, and free the device-side hash table to prevent video memory leaks;
S9. Read the new RGB image, depth map, and point cloud of the current moment from the TOF module and return to step S3.
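A minimal sketch of the per-voxel TSDF update referenced in S6, written as a plain function; in the described method the same logic would run inside a CUDA kernel with one thread per voxel. The weighted running average is the standard TSDF formulation, and parameter names are illustrative:

```cpp
#include <cmath>

struct Voxel {
    float tsdf;    // truncated signed distance, in [-1, 1]
    float weight;  // accumulated observation weight
};

// Standard per-voxel TSDF update, assuming the voxel has already been
// transformed into camera coordinates (its depth along the optical axis
// is camZ) and projected to a pixel whose measured depth is 'depth'.
// 'mu' is the truncation band; 'maxWeight' caps the running weight.
void updateVoxelTSDF(Voxel& v, float camZ, float depth,
                     float mu, float maxWeight) {
    if (depth <= 0.0f) return;               // no valid measurement here
    float sdf = depth - camZ;                // signed distance to surface
    if (sdf < -mu) return;                   // voxel far behind the surface
    float tsdf = std::fmin(1.0f, sdf / mu);  // truncate to [-1, 1]
    // Weighted running average (weight 1 per new observation).
    v.tsdf = (v.tsdf * v.weight + tsdf) / (v.weight + 1.0f);
    v.weight = std::fmin(v.weight + 1.0f, maxWeight);
}
```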
Compared with the prior art, the beneficial effects are:
1. The method for three-dimensional reconstruction of objects based on a depth camera module provided by the present invention encodes hash values with the MD5 method, effectively reducing the number of collisions, so that insertion, lookup, and deletion operations on the hash table can be performed faster;
2. The new memory allocation scheme proposed by the present invention can allocate new memory during real-time scanning, so that it adapts to scene reconstruction over a larger area and improves the practicality of the algorithm;
3. The present invention adopts a new method of aligning two consecutive frames: the ORB feature points of the two color frames are computed and well-matched corresponding point pairs are selected; the correspondences are used to obtain from the depth maps the world coordinates of the previous frame and the camera coordinates of the following frame, from which the camera extrinsics of the following frame are solved and the two frames aligned. This method finds correspondences more accurately than the ICP algorithm and reduces drift in three-dimensional reconstruction.
Description of the drawings
Figure 1 is the overall flow chart of the three-dimensional reconstruction method of the present invention.
Figure 2 is a schematic diagram of the MD5 encoding scheme of the present invention.
Figure 3 is a schematic diagram of the TSDF computation principle of the present invention.
Detailed description of the embodiments
The drawings are for illustrative purposes only and are not to be construed as limiting the present invention. To better illustrate this embodiment, some components in the drawings are omitted, enlarged, or reduced and do not represent the dimensions of the actual product; those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present invention.
Embodiment 1:
As shown in Figure 1, a method for three-dimensional reconstruction of objects based on a depth camera module specifically comprises the following steps:
S1. Establish the world model of the architecture: three-dimensional reconstruction essentially builds a voxel set over a space large enough to enclose the object to be reconstructed. This parent voxel set decomposes into three levels of substructure: the chunk is the first-level substructure, with the voxel set containing n*n*n chunks; the block is the second-level substructure, with each chunk containing m*m*m blocks; the voxel is the third-level substructure, with each block containing t*t*t voxels. The variables m, n, and t are positive integers and can be set according to the specific situation. In this embodiment, m is 128, n is 8, and t is 5; in general, larger values yield a more faithful reconstructed surface but reduce computation speed somewhat.
S2. Establish a "video stream" from the TOF module; obtain from it the depth map, RGB image, and point cloud at the current time, and read the camera intrinsics K of the TOF module and the camera extrinsics T of the current pose into the corresponding parameters.
S3. From the color images of two consecutive frames, compute their ORB feature points and select well-matched corresponding point pairs; using the correspondences, obtain from the depth maps the world coordinates of the previous frame and the camera coordinates of the current frame, solve for the camera extrinsics of the current frame, and thereby align the point cloud of the current frame into the world coordinate system of the reference frame (the first frame). A sketch of this alignment follows this list.
S4. Build a scalable dynamic hash table on the GPU device through the MD5-based voxel hashing method; based on the view-frustum principle, move the blocks lying within the frustum from the host into the device-side hash table.
S5. Using the depth map of the current frame together with the information in the hash table, select the blocks that are valid within the truncation range of the depth region; build a new dynamically allocatable hash table on the GPU device through the dynamic memory management method, and move the valid blocks into it. The new hash table records the position information of every valid block and the location of the voxel array it points to.
S6. Traverse all block positions in the new hash table and launch CUDA threads to traverse every voxel in each block. From the initial position of the parent voxel set's origin in the world coordinate system and the true length of each voxel, compute the voxel's position in the world coordinate system; project it from world coordinates to pixel coordinates with the back-projection formula, and determine whether the voxel's world position is actually visible within the depth frame. If it is not visible, do nothing; otherwise update its TSDF value and its current weight with the truncated signed distance field (TSDF) formula, as shown in Figure 3.
S7. The TSDF values amount to a family of isosurfaces, and the positions where the TSDF is 0 correspond to the object surface; generate a triangular surface mesh with the Marching Cubes technique and render it.
S8. Copy all voxels in the new hash table from the GPU device back to the host, record the voxel contents at those positions on the host, and free the device-side hash table to prevent video memory leaks.
S9. Read the new RGB image, depth map, and point cloud of the current moment from the TOF module and return to step S3.
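A minimal sketch of the S3 alignment, assuming OpenCV for ORB detection and matching and Eigen's Umeyama fit as the rigid solver over the world/camera point pairs; the patent names neither library nor a specific solver, so treat both as stand-ins:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <Eigen/Dense>
#include <Eigen/Geometry>
#include <vector>

// Back-project pixel (u, v) with depth d through pinhole intrinsics into
// a 3D camera-space point. fx, fy, cx, cy come from the TOF intrinsics K.
static Eigen::Vector3d backproject(double u, double v, double d,
                                   double fx, double fy, double cx, double cy) {
    return {(u - cx) * d / fx, (v - cy) * d / fy, d};
}

// S3 sketch: ORB matches between consecutive color frames give pixel
// correspondences; each is back-projected with its depth. Previous-frame
// points go to world coordinates via the known previous pose, current-frame
// points stay in camera coordinates, and a rigid Umeyama fit recovers the
// current extrinsics (world -> camera). Depth maps are assumed CV_32F.
Eigen::Matrix4d solveCurrentExtrinsics(
        const cv::Mat& prevColor, const cv::Mat& currColor,
        const cv::Mat& prevDepth, const cv::Mat& currDepth,
        const Eigen::Matrix4d& prevCamToWorld,
        double fx, double fy, double cx, double cy) {
    auto orb = cv::ORB::create(1000);
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    orb->detectAndCompute(prevColor, cv::noArray(), kp1, desc1);
    orb->detectAndCompute(currColor, cv::noArray(), kp2, desc2);

    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(desc1, desc2, matches);

    Eigen::Matrix3Xd world(3, 0), cam(3, 0);
    for (const auto& m : matches) {
        cv::Point2f p1 = kp1[m.queryIdx].pt, p2 = kp2[m.trainIdx].pt;
        float d1 = prevDepth.at<float>(cvRound(p1.y), cvRound(p1.x));
        float d2 = currDepth.at<float>(cvRound(p2.y), cvRound(p2.x));
        if (d1 <= 0.0f || d2 <= 0.0f) continue;   // drop invalid depths
        Eigen::Vector3d c1 = backproject(p1.x, p1.y, d1, fx, fy, cx, cy);
        Eigen::Vector3d c2 = backproject(p2.x, p2.y, d2, fx, fy, cx, cy);
        Eigen::Vector3d w1 = (prevCamToWorld * c1.homogeneous()).head<3>();
        world.conservativeResize(Eigen::NoChange, world.cols() + 1);
        cam.conservativeResize(Eigen::NoChange, cam.cols() + 1);
        world.col(world.cols() - 1) = w1;
        cam.col(cam.cols() - 1) = c2;
    }
    // Needs at least 3 non-degenerate pairs; umeyama maps src -> dst,
    // here world points onto current-frame camera points (the extrinsics).
    return Eigen::umeyama(world, cam, /*with_scaling=*/false);
}
```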
In this embodiment, in the hash computation the key of the hash table is obtained from the coordinates of the voxel center point through a hash function. A new hash function method is used to compute the hash value, comprising the following steps:
First, the three-dimensional coordinates are converted into a one-dimensional index by the conversion formula in which Δ is the size of the current device resolution, that is, the size of one voxel;
Then, the computed one-dimensional index data are converted into a hash value by the MD5 method.
As shown in Figure 2, the MD5 conversion process specifically comprises the following steps:
S21. Append a single 1 bit and then 0 bits to the one-dimensional index data until its length in bits is congruent to 448 modulo 512; the length of the data before padding is recorded in the final 64 bits, so the padded data length is an integer multiple of 512;
S22. Process the padded data in 512-bit groups, with each group divided into sixteen 32-bit sub-blocks and four 32-bit registers (four symbols) cycled over the sub-blocks; initialize MD5 with four chaining variables as parameters: a = 0x67452301, b = 0xefcdab89, c = 0x98badcfe, d = 0x10325476;
S23. Perform four rounds of compression, each round consisting of 16 steps and each step consuming one message word, where a message is one 512-bit group of data; update the variables a, b, c, d with the step function Q_{i+1} = Q_i + ((Q_{i-3} + f_i(Q_i, Q_{i-1}, Q_{i-2}) + w_i + t_i) <<< s_i);
S24. Concatenate the four 32-bit registers to obtain a 128-bit value, the final hash value.
In this embodiment, a dynamic memory management method is proposed. The existing voxel-hashing hash table pre-allocates its memory; although this improves index efficiency, it limits the extent of the scene that can be reconstructed. Borrowing from the hash map of the C++11 standard template library, the present invention implements dynamic expansion of memory: when the pre-allocated memory is full, the hash table memory is expanded dynamically, enabling three-dimensional reconstruction over a larger region. The dynamic memory management method comprises the following steps: add an integer variable to the hash table representing the number of positions currently available; when its value approaches 0, stop inserting into the table, allocate a new hash table of equal size, point the head of the new table at the tail of the old one, and then rehash. If the original table held n buckets, the combined table now holds 2n; the hash value of every previously inserted element is taken modulo 2n to obtain its new bucket number, and the elements of the old table are moved into the expanded table. Once rehashing is complete, insertions into the new table can resume.
Obviously, the above embodiments of the present invention are merely examples given to illustrate the present invention clearly and are not intended to limit its implementation. For those of ordinary skill in the art, changes or modifications in other different forms can be made on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (2)
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910994807.4A | 2019-10-18 | 2019-10-18 | Object three-dimensional reconstruction method based on depth camera module |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN110751684A | 2020-02-04 |
| CN110751684B | 2023-09-29 |
Family
ID=69278876

Family Applications (1)

| Application Number | Priority Date | Filing Date | Status |
|---|---|---|---|
| CN201910994807.4A (granted as CN110751684B) | 2019-10-18 | 2019-10-18 | Active |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN110751684B |
Families Citing this family (3)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114529671A | 2022-02-15 | 2022-05-24 | 国网山东省电力公司建设公司 | Substation construction site three-dimensional reconstruction method based on GPU parallel computation and hash table |
| CN117952614A | 2024-03-14 | 2024-04-30 | 深圳市金政软件技术有限公司 | Payment anti-duplicate method, device, terminal equipment and readable storage medium |
| CN119180917B | 2024-11-20 | 2025-04-08 | 湖南芒果数智艺术科技有限责任公司 | Target object surface reconstruction method and related device |
Citations (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108961390A | 2018-06-08 | 2018-12-07 | 华中科技大学 | Real-time three-dimensional reconstruction method based on depth map |
| CN110310362A | 2019-06-24 | 2019-10-08 | 中国科学院自动化研究所 | Method and system for 3D reconstruction of high dynamic scene based on depth map and IMU |
Family Cites Families (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9053571B2 | 2011-06-06 | 2015-06-09 | Microsoft Corporation | Generating computer models of 3D objects |
| WO2018011370A1 | 2016-07-13 | 2018-01-18 | Naked Labs Austria Gmbh | Efficient volumetric reconstruction with depth sensors |

2019
- 2019-10-18: CN application CN201910994807.4A filed; granted as CN110751684B (active)
Non-Patent Citations (2)

| Title |
|---|
| 郑顺义 et al. 三维点云数据实时管理的Hash map方法 [A hash-map method for real-time management of 3D point cloud data]. 测绘学报 (Acta Geodaetica et Cartographica Sinica), 2018, 47(6): 825-832. |
| 陈尚文. 多传感器融合的三维场景感知 [Multi-sensor fusion for 3D scene perception]. 中国博士学位论文全文数据库 信息科技辑 (China Doctoral Dissertations Full-text Database, Information Science and Technology), 2018. |
Also Published As

| Publication number | Publication date |
|---|---|
| CN110751684A | 2020-02-04 |
Similar Documents

| Publication | Title |
|---|---|
| CN108961390B | Real-time 3D reconstruction method based on depth map |
| CN105405166B | A LOD model generation method based on linear quadtree |
| CN110751684B | Object three-dimensional reconstruction method based on depth camera module |
| CN103345771B | An efficient image rendering method based on modeling |
| CN109257604A | A color attribute coding method based on the TMC3 point cloud encoder |
| CN113112581B | Method, device, equipment and storage medium for generating texture map of three-dimensional model |
| CN103248911B | A virtual viewpoint rendering method based on space-time combination in multi-viewpoint video |
| CN103024421B | Method for synthesizing virtual viewpoints in free viewpoint television |
| CN109147025B | A texture generation method for RGBD 3D reconstruction |
| CN112017228B | Method and related equipment for three-dimensional reconstruction of object |
| CN109684857B | Information hiding method, terminal device and storage medium |
| KR102346090B1 | AR remote rendering method for real-time MR service with volumetric 3D video data |
| CN102034265A | Three-dimensional view acquisition method |
| CN111402412A | Data acquisition method and device, equipment and storage medium |
| CN117115359A | Multi-view power grid three-dimensional space data reconstruction method based on depth map fusion |
| JP2023536621A | Image grouping method and apparatus for three-dimensional reconstruction, electronic device, and computer-readable storage medium |
| CN104270624A | A region-based 3D video mapping method |
| WO2023066122A1 | Three-dimensional model data processing method, three-dimensional model data generation method, and related apparatuses |
| CN116503551A | Three-dimensional reconstruction method and device |
| Zhou et al. | Live4D: a real-time capture system for streamable volumetric video |
| CN115049782A | Method and device for reconstructing dense three-dimensional model and readable storage medium |
| CN113887491A | Human skeleton behavior recognition system and method based on cross-space-time graph convolutional network |
| JP2001283201A | Method for creating three-dimensional image data and method for creating arbitrary-viewpoint image using three-dimensional image data |
| CN109493406B | Fast percentage-closer soft shadow rendering method |
| CN118172469A | Neural radiance field rendering model construction method and image rendering method |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |
| | GR01 | Patent grant |