CN117953029B - General depth map completion method and device based on depth information propagation
- Publication number: CN117953029B (application CN202410356521.4A)
- Authority: CN (China)
- Legal status: Active (an assumption, not a legal conclusion)
Classifications
- G06T7/50: Image analysis; depth or shape recovery
- G06N3/0464: Neural networks; convolutional networks [CNN, ConvNet]
- G06N3/08: Neural networks; learning methods
- G06T2207/20081: Indexing scheme for image analysis; training or learning
- G06T2207/20084: Indexing scheme for image analysis; artificial neural networks [ANN]
Abstract
Description
Technical Field

The present invention relates to the field of image enhancement, and in particular to a general depth map completion method and device based on depth information propagation.
Background Art

With the rapid progress of autonomous driving, robotics, augmented reality and related fields in recent years, depth maps have become increasingly important. They play a vital auxiliary role in these tasks and enable them to be completed with excellent results. Depth maps are usually acquired with commercial depth sensors such as structured-light sensors and ToF (time-of-flight) LiDAR. However, every commercial sensor has its own limitations. For example, the Velodyne HDL-64E, a LiDAR commonly used in autonomous driving, produces only a low-resolution sparse depth map in which the valid depth pixels amount to only about 5% of the pixels of the corresponding RGB image. A depth map this sparse can support basic 3D vision tasks such as obstacle avoidance and moving-object detection, but it falls short for more complex tasks such as autonomous driving. The sparsity of ordinary commercial LiDAR scans greatly limits their reliability.

To overcome these sensor limitations, many studies have used a given sparse depth map together with the corresponding RGB image to obtain a dense depth map, an approach known as "depth completion". Through depth completion, complete depth information can be recovered from sparse depth measurements, improving the reliability and accuracy of the depth map. Most depth completion methods have evolved from filling the holes in sparse depth maps with classical image-processing operations (such as morphological erosion and dilation) to feeding the sparse depth map into a convolutional neural network that directly predicts a dense depth map.

However, the predictions of convolutional neural networks suffer from blurred edges. To address this, the convolutional spatial propagation network (CSPN) was proposed to refine the predicted depth map and obtain sharp depth edges and more accurate depth values. Such a method typically predicts an affinity coefficient between each pixel and its surrounding pixels, then uses these coefficients to propagate the existing depth information over multiple iterations, finally producing a dense depth map. In this spatial-propagation-based depth completion framework, the network takes the sparse depth map and the corresponding RGB image as input, and outputs the initial depth map for the propagation process and the affinity coefficient map used for propagation.
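The propagation update described above can be sketched in a few lines. The following is a minimal NumPy illustration of one CSPN-style iteration over a 3x3 neighborhood; the absolute-value normalization and the neighborhood size are common choices, not the patent's exact formulation.

```python
import numpy as np

def cspn_step(depth, affinity):
    """One spatial propagation step: each pixel is replaced by an
    affinity-weighted average of its 3x3 neighborhood."""
    H, W = depth.shape
    w = np.abs(affinity)                      # (H, W, 9) raw weights
    w = w / w.sum(axis=-1, keepdims=True)     # normalize so weights sum to 1
    padded = np.pad(depth, 1, mode="edge")
    out = np.zeros_like(depth, dtype=float)
    k = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += w[..., k] * padded[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
            k += 1
    return out
```

Because the weights are normalized per pixel, propagating a constant depth map leaves it unchanged; learned affinities sharpen edges by weighting neighbors unevenly.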
However, this framework is in fact a compromise to cope with the sparsity of depth scans. The network simultaneously carries the tasks of predicting the dense depth map and the affinity coefficients, yet neither task receives direct supervision. As a result the network learns both tasks insufficiently, the generalization ability of the model degrades, and the credibility of the completed depth map is low. Existing spatial propagation network architectures also use rather complex designs for the affinity-generation branch and a large number of propagation iterations, so such spatial-propagation-based depth completion methods are computationally expensive and their inference is slow.

The prior art therefore lacks a depth map completion method that overcomes the insufficient resolution of depth sensors while offering high completion accuracy and fast inference.
Summary of the Invention

To solve the technical problems of the prior art, namely that the insufficient resolution of depth sensors makes the acquired depth information too sparse and the measurements inaccurate, and that existing spatial-propagation-based depth completion methods have low accuracy and a large computational cost, embodiments of the present invention provide a general depth map completion method and device based on depth information propagation. The technical solution is as follows:
In one aspect, a general depth map completion method based on depth information propagation is provided. The method is implemented by a general depth map completion device and includes:

collecting data of a scene with a depth sensor to obtain a sparse depth map, and collecting data of the scene with a color camera to obtain an RGB image;

depth-filling the sparse depth map with a pre-filling method to obtain a dense depth map;

inputting the sparse depth map, the RGB image and the dense depth map into a ResUNeT network for feature extraction to obtain affinity maps;

performing iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map.
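Read as a data flow, the four steps above can be sketched as follows. Here `prefill`, `predict_affinities` and `propagate` are hypothetical stand-ins for the pre-filling method, the ResUNeT forward pass and one propagation iteration, and the fixed iteration count is an assumption.

```python
import numpy as np

def complete_depth(sparse_depth, rgb, prefill, predict_affinities,
                   propagate, iters=6):
    """Hypothetical end-to-end flow of the claimed method:
    pre-fill -> predict affinity maps -> iterative propagation."""
    dense = prefill(sparse_depth)                              # step 2
    affinities = predict_affinities(sparse_depth, rgb, dense)  # step 3
    depth = dense
    for aff in affinities:   # step 4 (resampling between scales omitted)
        for _ in range(iters):
            depth = propagate(depth, aff)
    return depth
```

The point of the decoupling is visible in the signature: `prefill` is an interchangeable argument rather than something the network must learn.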
The pre-filling method is a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network, or a fully convolutional spatial propagation network.
The affinity maps include a first affinity map, a second affinity map and a third affinity map.

The first affinity map is used for completing structural information; its size is one sixteenth of the size of the dense depth map. The size of the dense depth map is m × n, where m is the length of the dense depth map and n is its width.

The second affinity map is used for completing detail information; its size is one quarter of the size of the dense depth map.

The third affinity map is used for completing detail information; its size equals the size of the dense depth map.
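For concreteness, the three affinity map sizes can be computed as below. The patent says "one sixteenth" and "one quarter" of the size; this sketch reads those as area ratios (linear downsampling factors of 4 and 2), which matches the usual encoder striding, but that interpretation is an assumption.

```python
def affinity_scales(m, n):
    """Spatial sizes of the three affinity maps for an m x n dense
    depth map, reading 1/16 and 1/4 'size' as area ratios."""
    return [
        (m // 4, n // 4),  # first affinity map: structure, 1/16 of the area
        (m // 2, n // 2),  # second affinity map: detail, 1/4 of the area
        (m, n),            # third affinity map: detail, full resolution
    ]
```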
The ResUNeT network includes a feature extraction branch and affinity map generation branches.

The feature extraction branch is an encoder-decoder structure; the encoder includes 5 convolutional layers and the decoder includes 4 deconvolutional layers.

The affinity map generation branches include a first, a second and a third affinity map generation branch; each affinity map generation branch includes 2 deconvolutional layers and 1 convolutional layer.
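As a rough bookkeeping aid, the layer counts above can be traced as follows, assuming each encoder convolution after the first halves the resolution and each decoder deconvolution doubles it. These strides are a common design; the patent does not fix them.

```python
def resunet_trace(h, w):
    """Trace feature-map resolutions through 5 encoder convolutions
    (4 of them assumed strided) and 4 decoder deconvolutions."""
    enc = [(h, w)]                 # after the first, stride-1 convolution
    for _ in range(4):             # four assumed stride-2 encoder convolutions
        h, w = h // 2, w // 2
        enc.append((h, w))
    dec = []
    for _ in range(4):             # four assumed stride-2 deconvolutions
        h, w = h * 2, w * 2
        dec.append((h, w))
    return enc, dec
```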
Optionally, inputting the sparse depth map, the RGB image and the dense depth map into the ResUNeT network for feature extraction to obtain the affinity maps includes:

performing feature extraction on the sparse depth map, the RGB image and the dense depth map through the ResUNeT network to obtain first, second and third image features;

the first image features include a first sparse depth map feature, a first RGB image feature and a first dense depth map feature, and their size is one sixteenth of the size of the dense depth map;

the second image features include a second sparse depth map feature, a second RGB image feature and a second dense depth map feature, and their size is one quarter of the size of the dense depth map;

the third image features include a third sparse depth map feature, a third RGB image feature and a third dense depth map feature, and their size equals the size of the dense depth map;

performing a convolution operation on the first image features to obtain the first affinity map;

performing a convolution operation on the second image features to obtain the second affinity map;

performing a convolution operation on the third image features to obtain the third affinity map.
Optionally, performing iterative propagation according to the dense depth map and the affinity maps to obtain the completed depth map includes:

downsampling the dense depth map to obtain a first dense depth map, and iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;

upsampling the first completed depth map to obtain a second dense depth map, and iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;

upsampling the second completed depth map to obtain a third dense depth map, and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
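The coarse-to-fine loop above can be sketched as follows. Nearest-neighbour resampling by striding and repetition is a simplifying assumption, and `step` stands in for one propagation iteration at the current scale.

```python
import numpy as np

def hierarchical_propagate(dense, affinities, step, iters=3):
    """Propagate coarse to fine: start at the coarsest affinity scale,
    refine, then upsample and refine again at each finer scale."""
    # downsample the pre-filled map to the coarsest scale by striding
    f = dense.shape[0] // affinities[0].shape[0]
    depth = dense[::f, ::f]
    for aff in affinities:
        r = aff.shape[0] // depth.shape[0]
        if r > 1:                  # nearest-neighbour upsampling
            depth = depth.repeat(r, axis=0).repeat(r, axis=1)
        for _ in range(iters):
            depth = step(depth, aff)
    return depth
```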
In another aspect, a general depth map completion device based on depth information propagation is provided. The device applies the general depth map completion method based on depth information propagation and includes:

a data acquisition module, configured to collect data of a scene with a depth sensor to obtain a sparse depth map, and to collect data of the scene with a color camera to obtain an RGB image;

a pre-filling module, configured to depth-fill the sparse depth map with a pre-filling method to obtain a dense depth map;

an affinity map generation module, configured to input the sparse depth map, the RGB image and the dense depth map into a ResUNeT network for feature extraction to obtain affinity maps;

a depth map completion module, configured to perform iterative propagation according to the dense depth map and the affinity maps to obtain a completed depth map.
The pre-filling method is a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network, or a fully convolutional spatial propagation network.

The affinity maps include a first affinity map, a second affinity map and a third affinity map.

The first affinity map is used for completing structural information; its size is one sixteenth of the size of the dense depth map. The size of the dense depth map is m × n, where m is the length of the dense depth map and n is its width.

The second affinity map is used for completing detail information; its size is one quarter of the size of the dense depth map.

The third affinity map is used for completing detail information; its size equals the size of the dense depth map.

The ResUNeT network includes a feature extraction branch and affinity map generation branches.

The feature extraction branch is an encoder-decoder structure; the encoder includes 5 convolutional layers and the decoder includes 4 deconvolutional layers.

The affinity map generation branches include a first, a second and a third affinity map generation branch; each affinity map generation branch includes 2 deconvolutional layers and 1 convolutional layer.
Optionally, the affinity map generation module is further configured to:

perform feature extraction on the sparse depth map, the RGB image and the dense depth map through the ResUNeT network to obtain first, second and third image features;

the first image features include a first sparse depth map feature, a first RGB image feature and a first dense depth map feature, and their size is one sixteenth of the size of the dense depth map;

the second image features include a second sparse depth map feature, a second RGB image feature and a second dense depth map feature, and their size is one quarter of the size of the dense depth map;

the third image features include a third sparse depth map feature, a third RGB image feature and a third dense depth map feature, and their size equals the size of the dense depth map;

perform a convolution operation on the first image features to obtain the first affinity map;

perform a convolution operation on the second image features to obtain the second affinity map;

perform a convolution operation on the third image features to obtain the third affinity map.
Optionally, the depth map completion module is further configured to:

downsample the dense depth map to obtain a first dense depth map, and iteratively propagate the first dense depth map based on the first affinity map to obtain a first completed depth map;

upsample the first completed depth map to obtain a second dense depth map, and iteratively propagate the second dense depth map based on the second affinity map to obtain a second completed depth map;

upsample the second completed depth map to obtain a third dense depth map, and iteratively propagate the third dense depth map based on the third affinity map to obtain the completed depth map.
In another aspect, a general depth map completion device is provided, including a processor and a memory storing computer-readable instructions which, when executed by the processor, implement any of the general depth map completion methods based on depth information propagation described above.

In another aspect, a computer-readable storage medium is provided, storing at least one instruction that is loaded and executed by a processor to implement any of the general depth map completion methods based on depth information propagation described above.
The beneficial effects of the technical solution provided by the embodiments of the present invention include at least the following:

The present invention proposes a general depth map completion method based on depth information propagation. Because the pre-filling method is replaceable, the proposed method can improve accuracy on top of any pre-filling method: the higher the accuracy of the pre-filling method, the higher the final accuracy. Even after newer depth completion methods appear, the proposed method can take them as the pre-filling method and obtain more accurate completion results on top of them. The present invention also proposes a new affinity map generation method; the affinity maps it generates solve the problem of a large propagation range making inference too slow. Since the affinities of the three stages are generated from feature maps at the corresponding scales, the three scales correspond respectively to information ranging from abstract structure to concrete detail. The overall spatial propagation network framework of the present invention greatly reduces the risk of poor generalization in the trained network, and the participation of supervision information lowers the difficulty of the training process. The present invention is thus a depth map completion method with high completion accuracy and fast inference that overcomes the insufficient resolution of depth sensors.
Brief Description of the Drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative work.

FIG. 1 is a flow chart of a general depth map completion method based on depth information propagation provided by an embodiment of the present invention;

FIG. 2 is a block diagram of a general depth map completion device based on depth information propagation provided by an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a general depth map completion device provided by an embodiment of the present invention.
Detailed Description

The technical solution of the present invention is described below with reference to the accompanying drawings.

In the embodiments of the present invention, words such as "exemplarily" and "for example" are used to indicate examples, illustrations or explanations. Any embodiment or design described as an "example" should not be interpreted as preferable to or more advantageous than other embodiments or designs; rather, the word "example" is intended to present a concept in a concrete way. In addition, in the embodiments of the present invention, "and/or" can mean both, or either of the two.

In the embodiments of the present invention, "image" and "picture" are sometimes used interchangeably; when the difference is not emphasized, their meanings are the same. Likewise, "of", "relevant" and "corresponding" are sometimes used interchangeably; when the difference is not emphasized, their meanings are the same.

In the embodiments of the present invention, a subscript such as W1 may sometimes be written in non-subscript form as W1; when the difference is not emphasized, the meanings are the same.

To make the technical problems to be solved, the technical solutions and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
An embodiment of the present invention provides a general depth map completion method based on depth information propagation. The method can be implemented by a general depth map completion device, which can be a terminal or a server. As shown in the flow chart of FIG. 1, the processing flow of the method can include the following steps:

S1. Collect data of a scene with a depth sensor to obtain a sparse depth map; collect data of the scene with a color camera to obtain an RGB image.

In one feasible implementation, a depth sensor and a color camera are used to collect, respectively, a sparse depth map D_S of size m × n and an RGB image of a scene, where m is the length of the sparse depth map image, n is its width, and the subscript S denotes sparse.
S2. Depth-fill the sparse depth map with a pre-filling method to obtain a dense depth map.

In one feasible implementation, the present invention proposes a Decoupled Hierarchical Convolutional Spatial Propagation Network (DH-CSPN) framework to complete sparse depth scans. This framework differs greatly from traditional spatial-propagation-based depth completion frameworks such as the Convolutional Spatial Propagation Network (CSPN). The depth completion method used in this framework decouples the subsequent spatial propagation process: the initial value of the propagation process (i.e., the initial depth map for depth completion) is now obtained directly with the pre-filling method, whereas in the original CSPN framework the initial depth map is generated by the network itself.

Under the CSPN framework, neither the initial depth map nor the affinity map receives direct supervision, which makes both harder to learn and is the direct reason the model generalizes poorly. The framework proposed by the present invention fixes the initial dense depth map of the propagation iterations through pre-filling, thereby reducing the burden on the convolutional neural network, letting it focus on affinity prediction, and ultimately learning an affinity prediction method with stronger generalization and obtaining a more accurate depth completion result.

In this step, a dense depth map D_D of the same size as the sparse depth map is obtained, where the subscript D denotes dense.
其中,预填充方法为卷积空间传播网络、非局部空间传播网络、密集空间传播网络或全卷积空间传播网络。Among them, the pre-filling method is a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network or a fully convolutional spatial propagation network.
一种可行的实施方式中,本发明提出的预填充方法为任意的深度补全方法,在获得其他深度补全方法的补全结果的前提下,本发明的方法通过将该方法补全后的深度图送入本发明的深度补全网络框架中,对前述补全结果精炼化操作,最终获得精度更高的深度补全结果。这使得本发明可以成为一种通用的深度补全方法,可以接在任何深度补全方法的后端,对该方法进行精炼化后处理。In a feasible implementation, the pre-filling method proposed by the present invention is an arbitrary depth completion method. On the premise of obtaining the completion results of other depth completion methods, the method of the present invention refines the completion results by sending the depth map completed by the method into the depth completion network framework of the present invention, and finally obtains a more accurate depth completion result. This makes the present invention a universal depth completion method, which can be connected to the back end of any depth completion method to perform refined post-processing on the method.
预填充方法可以使用非常简单的图像处理膨胀操作,甚至不作任何操作,而更复杂化的,本框架的预填充方法可以替换为最新发布的各种深度补全方法。The pre-filling method can use very simple image processing dilation operations, or even no operation, while the pre-filling method of this framework can be replaced with various newly released depth completion methods for more complexity.
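As a concrete illustration of the simplest option mentioned above, the sketch below fills missing (zero) depths by repeated 3×3 max-filter dilation while preserving the measured values. The function name and iteration count are illustrative assumptions, not the patent's prescribed procedure.

```python
import numpy as np

def prefill_dilation(sparse_depth, iterations=8):
    """Fill zero (missing) pixels by repeated 3x3 max-filter dilation,
    keeping the originally measured depth values untouched."""
    dense = sparse_depth.astype(float).copy()
    H, W = dense.shape
    for _ in range(iterations):
        padded = np.pad(dense, 1, mode='constant')
        # 3x3 maximum over the neighborhood of every pixel
        stack = [padded[a:a + H, b:b + W] for a in range(3) for b in range(3)]
        dilated = np.max(stack, axis=0)
        # only fill holes; never overwrite measured depths
        dense = np.where(dense > 0, dense, dilated)
    return dense
```

Any stronger completion network can replace this function without changing the rest of the pipeline.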
S3. Input the sparse depth map, the RGB image, and the dense depth map into the ResUNet network for feature extraction to obtain the affinity maps.
Among them, the affinity maps comprise a first affinity map, a second affinity map, and a third affinity map.
The first affinity map is used for structural-information completion; its size is one sixteenth of that of the dense depth map. The dense depth map has size m×n, where m is its height and n is its width.
The second affinity map is used for detail-information completion; its size is one quarter of that of the dense depth map.
The third affinity map is used for detail-information completion; its size equals that of the dense depth map.
In one feasible implementation, the sparse depth map, the RGB image, and the dense depth map obtained by pre-filling are fed into a backbone network with a Residual U-shaped Network (ResUNet) structure, which extracts features from the three images and finally generates three affinity maps of different sizes.
The affinity maps have sizes (m/4)×(n/4)×k², (m/2)×(n/2)×k², and m×n×k², where k is the affinity propagation range, set to 3, so each affinity map propagates over a k×k square. An affinity map stores, for every pixel of the depth map, the correlation coefficients between that pixel and its neighbors; in the iterative propagation of the depth map, these coefficients control how strongly a depth value is propagated in each direction. For the k×k affinity maps adopted in the present invention, each point of the corresponding depth map is assigned the coefficients used when propagating depth information to the k×k range around that point.
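The bookkeeping behind these sizes can be sketched with a small helper (the function name is hypothetical, and spatial sizes are assumed divisible by 4):

```python
def affinity_map_shapes(m, n, k=3):
    """Shapes of the three affinity maps for an m x n dense depth map;
    each map holds k*k propagation coefficients per pixel."""
    assert m % 4 == 0 and n % 4 == 0, "spatial size must be divisible by 4"
    return [(m // 4, n // 4, k * k),   # stage 1: 1/16 of the pixels (structure)
            (m // 2, n // 2, k * k),   # stage 2: 1/4 of the pixels
            (m, n, k * k)]             # stage 3: full resolution (detail)
```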
Among them, the ResUNet network includes a feature extraction branch and affinity map generation branches.
The feature extraction branch is an encoder-decoder structure; the encoder includes 5 convolutional layers and the decoder includes 4 deconvolutional layers.
The affinity map generation branches include a first, a second, and a third affinity map generation branch; each branch consists of 2 deconvolutional layers and 1 convolutional layer.
In one feasible implementation, the ResUNet structure is divided into an encoder part and a decoder part. The encoder is designed with 5 convolutional layers: the first 4 (conv2–conv5) have the same structure as the corresponding layers of ResUNet34, and the last convolutional layer, conv6, has a stride of 2 and 512 channels and uses ReLU as the activation function.
The decoder is designed with 4 deconvolutional layers. Deconvolutional layer dec5 has a stride of 2 and 256 channels. Deconvolutional layer dec4 has a stride of 2 and 128 channels and concatenates the output of dec5 with the output of convolutional layer conv5. Deconvolutional layer dec3 has a stride of 2 and 64 channels and concatenates the output of conv4 with the output of dec4. Deconvolutional layer dec2 has a stride of 2 and 64 channels and concatenates the outputs of conv3 and dec3.
Optionally, performing feature extraction through the ResUNet network according to the sparse depth map, the RGB image, and the dense depth map to obtain the affinity maps includes:
performing feature extraction through the ResUNet network according to the sparse depth map, the RGB image, and the dense depth map to obtain a first image feature, a second image feature, and a third image feature, where:
the first image feature includes a first sparse depth map feature, a first RGB image feature, and a first dense depth map feature, and its size is one sixteenth of that of the dense depth map;
the second image feature includes a second sparse depth map feature, a second RGB image feature, and a second dense depth map feature, and its size is one quarter of that of the dense depth map;
the third image feature includes a third sparse depth map feature, a third RGB image feature, and a third dense depth map feature, and its size equals that of the dense depth map;
performing a convolution operation on the first image feature to obtain the first affinity map;
performing a convolution operation on the second image feature to obtain the second affinity map;
performing a convolution operation on the third image feature to obtain the third affinity map.
In one feasible implementation, an affinity map is essentially a mapping that converts the initial depth map into a more refined depth map, and the expressive power of this mapping naturally determines the accuracy of the final refined result.
According to the concept of a dynamic affinity range, the propagation range of both the first and the second affinity map is a k×k square; but because propagation is performed at 1/16 and 1/4 resolution respectively, they aggregate depth information over 16 times and 4 times the original range relative to the full-resolution map.
The first, second, and third affinity maps generated in the three stages correspond to feature maps at three scales, and these three scales range from abstract structural information to concrete detail information. Therefore, in the lower-resolution first stage, the first affinity map of DH-CSPN can focus more on structural information, while the third affinity map generated in the final stage can focus more on detail information. This design is based on dynamic affinity and introduces the concept of weighting.
The network structure that generates the first affinity map includes deconvolutional layers gd2-dec1 and gd2-dec0 and convolutional layer gd2-conf0. Deconvolutional layer gd2-dec1 has a stride of 2 and 128 channels and concatenates the outputs of convolutional layer conv4 and deconvolutional layer dec4; deconvolutional layer gd2-dec0 has a stride of 1; convolutional layer gd2-conf0 has a stride of 1 and uses Sigmoid as the activation function.
The network structure that generates the second affinity map includes deconvolutional layers gd1-dec1 and gd1-dec0 and convolutional layer gd1-conf0. Deconvolutional layer gd1-dec1 has a stride of 1 and concatenates the outputs of convolutional layer conv3 and deconvolutional layer dec3; deconvolutional layer gd1-dec0 has a stride of 1; convolutional layer gd1-conf0 has a stride of 1 and uses Sigmoid as the activation function.
The network structure that generates the third affinity map includes deconvolutional layer gd0-dec1, concatenation layer concat2, deconvolutional layer gd0-dec0, and convolutional layer gd0-conf0. Deconvolutional layer gd0-dec1 has a stride of 2 and 64 channels and concatenates the outputs of convolutional layer conv2 and deconvolutional layer dec2; concatenation layer concat2 links the outputs of gd0-dec1, dec1, and concat1; deconvolutional layer gd0-dec0 has a stride of 1; convolutional layer gd0-conf0 has a stride of 1 and uses Sigmoid as the activation function.
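The Sigmoid outputs above lie in (0, 1) but are not automatically safe propagation weights; CSPN-style methods commonly rescale each pixel's k² coefficients so that they sum to at most 1, which keeps repeated propagation from diverging. The following sketch shows that common stabilization; it is an assumption of this illustration, not a step stated in the text.

```python
import numpy as np

def normalize_affinity(raw_affinity):
    """Rescale per-pixel affinities of shape (H, W, k*k) so the absolute
    values at each pixel sum to at most 1, keeping propagation stable."""
    denom = np.abs(raw_affinity).sum(axis=-1, keepdims=True)
    # only shrink pixels whose weights would amplify the signal
    return raw_affinity / np.maximum(denom, 1.0)
```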
In the initial stage of the subsequent iterative propagation, the depth image has been downsampled, so detail information is lost and only structural information is retained. Precisely because the 1/16-size depth map contains only structural information, the first affinity map of the present invention concentrates on that aspect. Similarly, in the full-resolution (1/1) result of the third stage, the third affinity map generated by this method concentrates on detail information.
The three-scale hierarchical design means that both the affinity generation and the iterative depth-value propagation of the first two stages are carried out at smaller scales, with corresponding computation of only 1/16 and 1/4 of the original. This gives the present method an advantage in computational cost over the Non-local Spatial Propagation Network (NLSPN) and the Dynamic Spatial Propagation Network (DySPN).
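This cost advantage can be checked with a quick count: one propagation iteration touches on the order of H·W·k² multiply-accumulates, so stages run at 1/4 and 1/2 of each spatial dimension cost 1/16 and 1/4 of a full-resolution stage. A sketch under that simple cost model (real FLOP counts depend on the implementation):

```python
def propagation_cost(h, w, k=3, iters=6):
    """Multiply-accumulates for `iters` propagation iterations on an h x w map."""
    return h * w * k * k * iters

def hierarchy_cost_ratio(h, w):
    """Cost of the 3-stage coarse-to-fine schedule relative to running
    all three stages at full resolution."""
    staged = (propagation_cost(h // 4, w // 4)    # stage 1: 1/16 cost
              + propagation_cost(h // 2, w // 2)  # stage 2: 1/4 cost
              + propagation_cost(h, w))           # stage 3: full cost
    return staged / (3 * propagation_cost(h, w))
```

The ratio (1/16 + 1/4 + 1)/3 ≈ 0.44 shows the hierarchy does well under half the work of three full-resolution stages.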
S4. Perform iterative propagation according to the dense depth map and the affinity maps to obtain the completed depth map.
Optionally, performing iterative propagation according to the dense depth map and the affinity maps to obtain the completed depth map includes:
downsampling the dense depth map to obtain a first dense depth map, and iteratively propagating the first dense depth map based on the first affinity map to obtain a first completed depth map;
upsampling the first completed depth map to obtain a second dense depth map, and iteratively propagating the second dense depth map based on the second affinity map to obtain a second completed depth map;
upsampling the second completed depth map to obtain a third dense depth map, and iteratively propagating the third dense depth map based on the third affinity map to obtain the completed depth map.
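The three steps above can be sketched as a single coarse-to-fine driver. Nearest-neighbor resampling and the `propagate` callback are illustrative placeholders for the downsampling, upsampling, and affinity-guided propagation operations described in the text:

```python
import numpy as np

def dh_cspn_refine(dense_depth, affinities, propagate, iters=6):
    """Coarse-to-fine refinement: propagate at 1/16 size, upsample x2,
    propagate at 1/4 size, upsample x2, propagate at full size.

    `affinities` holds the three affinity maps; `propagate(d, a, iters)`
    performs the affinity-guided iterative propagation for one stage."""
    d = dense_depth[::4, ::4]                          # 1/16-size initial map
    d = propagate(d, affinities[0], iters)             # stage 1: structure
    d = np.repeat(np.repeat(d, 2, axis=0), 2, axis=1)  # upsample x2
    d = propagate(d, affinities[1], iters)             # stage 2
    d = np.repeat(np.repeat(d, 2, axis=0), 2, axis=1)  # upsample x2
    return propagate(d, affinities[2], iters)          # stage 3: detail
```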
In one feasible implementation, after the first affinity map $A_{1/16}$ and the pre-filled dense depth map are obtained, the dense depth map of original size m×n is downsampled to 1/4 of its height and width, written $D_{1/16}$. Using the first affinity map and the depth map of the same size, the affinity map guides the iterative depth propagation of the depth map. The first-stage iterative propagation of depth values is expressed mathematically in Eq. (1):
$D_{1/16}^{N+1}(i,j)=\sum_{(a,b)} A_{1/16}(i,j,a,b)\odot D_{1/16}^{N}(i+a,\,j+b)$ (1)
where i is the row position and j the column position of each pixel in the depth map, the superscript N is the iteration index, $\odot$ denotes element-wise multiplication, and (a, b) is the relative position of each point within the k×k propagation range.
Eq. (1) shows that the depth value of each pixel after an iteration is obtained as a weighted sum, through the affinity map $A_{1/16}$, of the points within the k×k range around the corresponding position in the pre-iteration depth map $D_{1/16}^{N}$. In the first stage, this iterative propagation on the 1/16-size depth map is repeated 6 times, finally yielding the first-stage result.
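A minimal NumPy sketch of this weighted-sum update (edge padding and the per-pixel weight layout are assumptions of the illustration):

```python
import numpy as np

def propagation_step(depth, affinity, k=3):
    """One propagation iteration: each output pixel is the affinity-weighted
    sum of its k x k neighborhood in the previous depth map.

    depth:    (H, W) depth map from the previous iteration
    affinity: (H, W, k*k) per-pixel weights, assumed to sum to 1 per pixel
    """
    H, W = depth.shape
    r = k // 2
    padded = np.pad(depth, r, mode='edge')
    out = np.zeros_like(depth, dtype=float)
    idx = 0
    for a in range(k):          # iterate over the k x k offsets
        for b in range(k):
            out += affinity[:, :, idx] * padded[a:a + H, b:b + W]
            idx += 1
    return out
```

With affinity weights that sum to 1 at every pixel, a constant depth map is left unchanged, which is a useful sanity check.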
The first-stage result is upsampled so that its height and width are doubled, giving $D_{1/4}$. Using $D_{1/4}$ and the second affinity map $A_{1/4}$ of corresponding size, the depth values are iteratively propagated; this second-stage propagation is expressed in Eq. (2):
$D_{1/4}^{N+1}(i,j)=\sum_{(a,b)} A_{1/4}(i,j,a,b)\odot D_{1/4}^{N}(i+a,\,j+b)$ (2)
After 6 iterations, the second-stage result is obtained.
The second-stage result is upsampled so that its height and width are doubled, giving $D_{1/1}$. Using $D_{1/1}$ and the third affinity map $A_{1/1}$ of corresponding size, the depth values are iteratively propagated; this third-stage propagation is expressed in Eq. (3):
$D_{1/1}^{N+1}(i,j)=\sum_{(a,b)} A_{1/1}(i,j,a,b)\odot D_{1/1}^{N}(i+a,\,j+b)$ (3)
After 6 iterations, the final $D_{1/1}$ is obtained, which is the dense depth map completed by this method.
In one feasible implementation, the method of the present invention is experimentally verified on the NYU-Depth V2 dataset of various indoor scenes from New York University and on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) depth completion benchmark. Six evaluation metrics are used, whose mathematical expressions are given in Eq. (4):
$\mathrm{RMSE}=\sqrt{\tfrac{1}{|V|}\sum_{v\in V}(\hat d_v-d_v^{*})^2}$, $\mathrm{iRMSE}=\sqrt{\tfrac{1}{|V|}\sum_{v\in V}(1/\hat d_v-1/d_v^{*})^2}$, $\mathrm{MAE}=\tfrac{1}{|V|}\sum_{v\in V}|\hat d_v-d_v^{*}|$, $\mathrm{iMAE}=\tfrac{1}{|V|}\sum_{v\in V}|1/\hat d_v-1/d_v^{*}|$, $\mathrm{REL}=\tfrac{1}{|V|}\sum_{v\in V}|\hat d_v-d_v^{*}|/d_v^{*}$, $\delta_t=\tfrac{1}{|V|}\bigl|\{v\in V:\max(\hat d_v/d_v^{*},\,d_v^{*}/\hat d_v)<1.25^{t}\}\bigr|$ (4)
where $\hat d$ denotes the predicted depth map, $d^{*}$ denotes the actual depth of the scene, and V is the set of valid ground-truth pixels. The indicator $\delta_t$ takes t = 1, 2, 3. Among these metrics, the Root Mean Square Error (RMSE), inverse Root Mean Square Error (iRMSE), Mean Absolute Error (MAE), inverse Mean Absolute Error (iMAE), and Relative Error (REL) all measure error, so smaller is better; $\delta_t$ is the fraction of pixels within the specified error bound, so larger is better.
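Under the standard formulations these six metric names usually denote (the extracted Eq. (4) is incomplete, so the exact definitions used here are an assumption), the metrics can be computed as follows:

```python
import numpy as np

def depth_metrics(pred, gt, t=1):
    """Common depth-completion error metrics over valid (gt > 0) pixels.
    Assumes pred > 0 at valid pixels so inverse-depth terms are defined."""
    valid = gt > 0
    p, g = pred[valid], gt[valid]
    err = p - g
    inv_err = 1.0 / p - 1.0 / g
    return {
        'RMSE':  float(np.sqrt(np.mean(err ** 2))),
        'MAE':   float(np.mean(np.abs(err))),
        'iRMSE': float(np.sqrt(np.mean(inv_err ** 2))),
        'iMAE':  float(np.mean(np.abs(inv_err))),
        'REL':   float(np.mean(np.abs(err) / g)),
        'delta': float(np.mean(np.maximum(p / g, g / p) < 1.25 ** t)),
    }
```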
Each existing spatial-propagation-based method in the prior art was converted into the decoupled framework, and its accuracy was compared before and after decoupling. The error comparison on the NYUv2 dataset is shown in Table 1 (decoupling error comparison).
Table 1
Table 1 shows that every existing spatial-propagation-based method improves after decoupling, which verifies the effectiveness of the decoupling proposed by the present invention for improving depth-map accuracy.
The adaptability of the present invention to different pre-filling methods is tested by varying the pre-filling method. Table 2 (error comparison across pre-filling methods) gives the test results on the NYU dataset for each pre-filling method.
Table 2
The results in Table 2 show that the present invention improves upon every pre-filling method adopted, demonstrating its feasibility as a general depth completion framework.
Under the same ResUNet34 backbone, the accuracy and computational cost of several existing spatial-propagation-based methods are compared. Table 3 (accuracy and computation comparison of spatial-propagation methods) compares the results of these methods on the NYUv2 dataset.
Table 3
The results in Table 3 show that the method of the present invention achieves a large accuracy gain at a comparatively small computational cost, demonstrating the superiority of the multi-level affinity generation network structure designed in the present invention.
The present invention proposes a general depth map completion method based on depth information propagation. Through a replaceable pre-filling method, the proposed method can improve the accuracy of any pre-filling method: the more accurate the pre-filling method, the more accurate the final refined result. Therefore, even after newer depth completion methods appear, the proposed method can take them as pre-filling methods and obtain still more accurate depth completion results on top of them. The present invention also proposes a new affinity map generation method; the affinity maps generated this way solve the problem that a large propagation range makes inference too slow. The affinities of the three stages are generated from feature maps of the corresponding scales, and these three scales range from abstract structural information to concrete detail information. The overall spatial propagation network framework of the present invention greatly reduces the risk of poor generalization of the trained network, and the participation of supervision information lowers the difficulty of the learning and training process. The present invention is thus a depth map completion method with high completion accuracy and fast inference that overcomes the insufficient resolution of depth sensors.
FIG. 2 is a block diagram of a general depth map completion device based on depth information propagation according to an exemplary embodiment; the device is used for the general depth map completion method based on depth information propagation. Referring to FIG. 2, the device includes a data acquisition module 210, a pre-filling module 220, an affinity map generation module 230, and a depth map completion module 240, where:
the data acquisition module 210 is used to acquire scene data with a depth sensor to obtain a sparse depth map, and to acquire scene data with a color camera to obtain an RGB image;
the pre-filling module 220 is used to depth-fill the sparse depth map with a pre-filling method to obtain a dense depth map;
the affinity map generation module 230 is used to input the sparse depth map, the RGB image, and the dense depth map into the ResUNet network for feature extraction to obtain the affinity maps;
the depth map completion module 240 is used to perform iterative propagation according to the dense depth map and the affinity maps to obtain the completed depth map.
Among them, the pre-filling method may be a convolutional spatial propagation network, a non-local spatial propagation network, a dense spatial propagation network, or a fully convolutional spatial propagation network.
The affinity maps comprise a first affinity map, a second affinity map, and a third affinity map.
The first affinity map is used for structural-information completion; its size is one sixteenth of that of the dense depth map. The dense depth map has size m×n, where m is its height and n is its width.
The second affinity map is used for detail-information completion; its size is one quarter of that of the dense depth map.
The third affinity map is used for detail-information completion; its size equals that of the dense depth map.
Among them, the ResUNet network includes a feature extraction branch and affinity map generation branches.
The feature extraction branch is an encoder-decoder structure; the encoder includes 5 convolutional layers and the decoder includes 4 deconvolutional layers.
The affinity map generation branches include a first, a second, and a third affinity map generation branch; each branch consists of 2 deconvolutional layers and 1 convolutional layer.
Optionally, the affinity map generation module 230 is further configured to:
perform feature extraction through the ResUNet network according to the sparse depth map, the RGB image, and the dense depth map to obtain a first image feature, a second image feature, and a third image feature, where:
the first image feature includes a first sparse depth map feature, a first RGB image feature, and a first dense depth map feature, and its size is one sixteenth of that of the dense depth map;
the second image feature includes a second sparse depth map feature, a second RGB image feature, and a second dense depth map feature, and its size is one quarter of that of the dense depth map;
the third image feature includes a third sparse depth map feature, a third RGB image feature, and a third dense depth map feature, and its size equals that of the dense depth map;
perform a convolution operation on the first image feature to obtain the first affinity map;
perform a convolution operation on the second image feature to obtain the second affinity map;
perform a convolution operation on the third image feature to obtain the third affinity map.
Optionally, the depth map completion module 240 is further configured to:
downsample the dense depth map to obtain a first dense depth map, and iteratively propagate the first dense depth map based on the first affinity map to obtain a first completed depth map;
upsample the first completed depth map to obtain a second dense depth map, and iteratively propagate the second dense depth map based on the second affinity map to obtain a second completed depth map;
upsample the second completed depth map to obtain a third dense depth map, and iteratively propagate the third dense depth map based on the third affinity map to obtain the completed depth map.
FIG. 3 is a schematic structural diagram of a general depth map completion equipment provided by an embodiment of the present invention. As shown in FIG. 3, the general depth map completion equipment may include the general depth map completion device based on depth information propagation shown in FIG. 2. Optionally, the general depth map completion equipment 310 may include a processor 2001.
Optionally, the general depth map completion equipment 310 may further include a memory 2002 and a transceiver 2003.
The processor 2001 may be connected to the memory 2002 and the transceiver 2003, for example, through a communication bus.
The components of the general depth map completion equipment 310 are described in detail below with reference to FIG. 3:
The processor 2001 is the control center of the general depth map completion device 310, and may be a single processor or a collective term for multiple processing elements. For example, the processor 2001 may be one or more central processing units (CPUs), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention, such as one or more digital signal processors (DSPs) or one or more field-programmable gate arrays (FPGAs).
Optionally, the processor 2001 may perform the various functions of the general depth map completion device 310 by running or executing a software program stored in the memory 2002 and invoking data stored in the memory 2002.
In a specific implementation, as an embodiment, the processor 2001 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 3.
In a specific implementation, as an embodiment, the general depth map completion device 310 may also include multiple processors, such as the processor 2001 and the processor 2004 shown in FIG. 3. Each of these processors may be a single-core (single-CPU) or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
The memory 2002 is used to store the software program that executes the solution of the present invention, and its execution is controlled by the processor 2001. For the specific implementation, refer to the above method embodiment, which is not repeated here.
Optionally, the memory 2002 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 2002 may be integrated with the processor 2001, or may exist independently and be coupled to the processor 2001 through an interface circuit (not shown in FIG. 3) of the general depth map completion device 310, which is not specifically limited in the embodiments of the present invention.
The transceiver 2003 is used to communicate with a network device or a terminal device.
Optionally, the transceiver 2003 may include a receiver and a transmitter (not shown separately in FIG. 3), where the receiver implements the receiving function and the transmitter implements the sending function.
Optionally, the transceiver 2003 may be integrated with the processor 2001, or may exist independently and be coupled to the processor 2001 through an interface circuit (not shown in FIG. 3) of the general depth map completion device 310, which is not specifically limited in the embodiments of the present invention.
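As an illustrative aid only, the composition described above (a control-center processor, optional additional processors, an optional memory, and an optional transceiver split into receiver and transmitter functions) can be sketched as a simple data model. All class and field names here are hypothetical stand-ins for components 2001 to 2004 and are not part of the patent.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical sketch of the device composition described in the text;
# none of these names come from the patent itself.

@dataclass
class Processor:
    cores: int = 1  # single-CPU vs. multi-CPU, as in the embodiment

@dataclass
class Memory:
    kind: str = "RAM"  # ROM, RAM, EEPROM, CD-ROM, etc. per the text

@dataclass
class Transceiver:
    has_receiver: bool = True     # implements the receiving function
    has_transmitter: bool = True  # implements the sending function

@dataclass
class DepthCompletionDevice:
    processors: List[Processor]            # one or more processors (2001, 2004)
    memory: Optional[Memory] = None        # optional; may be integrated
    transceiver: Optional[Transceiver] = None  # optional; may be integrated

# A device with two processors, as in the multi-processor embodiment.
device = DepthCompletionDevice(
    processors=[Processor(cores=2), Processor(cores=1)],
    memory=Memory(),
    transceiver=Transceiver(),
)
print(len(device.processors))
```

The optional fields mirror the text's repeated "may be integrated with the processor 2001 or exist independently" phrasing: integration versus separation is a packaging choice, not a functional one.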
It should be noted that the structure of the general depth map completion device 310 shown in FIG. 3 does not constitute a limitation on the device; an actual general depth map completion device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In addition, for the technical effects of the general depth map completion device 310, refer to the technical effects of the general depth map completion method based on depth information propagation described in the above method embodiment, which are not repeated here.
It should be understood that the processor 2001 in the embodiments of the present invention may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
It should also be understood that the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware (such as circuits), firmware, or any combination thereof. When implemented by software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center containing one or more sets of available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium. The semiconductor medium may be a solid-state drive.
It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean that A exists alone, A and B both exist, or B exists alone, where A and B may be singular or plural. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it, but may also indicate an "and/or" relationship; refer to the context for the specific meaning.
In the present invention, "at least one" means one or more, and "plurality" means two or more. "At least one of the following" or similar expressions refer to any combination of the listed items, including any combination of single or plural items. For example, "at least one of a, b, or c" may mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
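The enumeration in the preceding paragraph, where "at least one of a, b, or c" yields a, b, c, a-b, a-c, b-c, and a-b-c, is exactly the set of non-empty combinations of the three items, which a short sketch can confirm:

```python
from itertools import combinations

items = ["a", "b", "c"]

# "At least one of a, b, or c" = every non-empty combination of the items,
# taken in increasing size: singles, then pairs, then the full triple.
combos = [
    "-".join(subset)
    for r in range(1, len(items) + 1)
    for subset in combinations(items, r)
]
print(combos)  # ['a', 'b', 'c', 'a-b', 'a-c', 'b-c', 'a-b-c']
```

For n items this produces 2^n - 1 combinations, matching the seven cases the text lists for three items.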
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the above-described devices, apparatuses, and units, reference may be made to the corresponding processes in the aforementioned method embodiments, which are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed devices, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a logical functional division, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in this technical field can easily conceive of changes or substitutions within the technical scope disclosed by the present invention, and these should all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410356521.4A CN117953029B (en) | 2024-03-27 | 2024-03-27 | General depth map completion method and device based on depth information propagation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117953029A CN117953029A (en) | 2024-04-30 |
CN117953029B true CN117953029B (en) | 2024-06-07 |
Family
ID=90803466
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410356521.4A Active CN117953029B (en) | 2024-03-27 | 2024-03-27 | General depth map completion method and device based on depth information propagation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117953029B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109685842A (en) * | 2018-12-14 | 2019-04-26 | 电子科技大学 | A kind of thick densification method of sparse depth based on multiple dimensioned network |
CN111401411A (en) * | 2020-02-28 | 2020-07-10 | 北京小米松果电子有限公司 | Method and device for acquiring sample image set |
CN112560875A (en) * | 2020-12-25 | 2021-03-26 | 北京百度网讯科技有限公司 | Deep information completion model training method, device, equipment and storage medium |
CN113486887A (en) * | 2021-06-30 | 2021-10-08 | 杭州飞步科技有限公司 | Target detection method and device in three-dimensional scene |
CN113936047A (en) * | 2021-10-14 | 2022-01-14 | 重庆大学 | Dense depth map generation method and system |
KR20220029335A (en) * | 2020-08-31 | 2022-03-08 | 삼성전자주식회사 | Method and apparatus to complement the depth image |
CN115482265A (en) * | 2022-08-31 | 2022-12-16 | 电子科技大学 | A Depth Completion Method for Outdoor Scenes Based on Continuous Video Stream |
CN115496788A (en) * | 2022-09-30 | 2022-12-20 | 杭州电子科技大学 | Deep completion method using airspace propagation post-processing module |
CN116245930A (en) * | 2023-02-28 | 2023-06-09 | 北京科技大学顺德创新学院 | A method and device for depth completion based on attention panorama perception guidance |
CN116468768A (en) * | 2023-04-20 | 2023-07-21 | 南京航空航天大学 | Scene depth completion method based on conditional variation self-encoder and geometric guidance |
CN117635444A (en) * | 2023-12-26 | 2024-03-01 | 中国人民解放军国防科技大学 | Depth completion method, device and equipment based on radiation difference and spatial distance |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10839543B2 (en) * | 2019-02-26 | 2020-11-17 | Baidu Usa Llc | Systems and methods for depth estimation using convolutional spatial propagation networks |
WO2021013334A1 (en) * | 2019-07-22 | 2021-01-28 | Toyota Motor Europe | Depth maps prediction system and training method for such a system |
- 2024-03-27: CN application CN202410356521.4A filed; patent CN117953029B active
Non-Patent Citations (2)
Title |
---|
A dual-branch guided network for depth completion; Qin Xiaofei et al.; Optical Instruments (光学仪器); 2023-05-08; Vol. 45, No. 5; full text *
Research on pixel-level scene understanding based on fully convolutional networks; Jiang Zhongze; China Master's Theses Full-text Database, Information Science and Technology; 2022-05-15, No. 05; full text *
Also Published As
Publication number | Publication date |
---|---|
CN117953029A (en) | 2024-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6745328B2 (en) | Method and apparatus for recovering point cloud data | |
US11244028B2 (en) | Neural network processor and convolution operation method thereof | |
US11934949B2 (en) | Composite binary decomposition network | |
US11429838B2 (en) | Neural network device for neural network operation, method of operating neural network device, and application processor including the neural network device | |
US12223288B2 (en) | Neural network processing unit including approximate multiplier and system on chip including the same | |
CN111401406B (en) | A neural network training method, video frame processing method and related equipment | |
US20190370647A1 (en) | Artificial intelligence analysis and explanation utilizing hardware measures of attention | |
Liu et al. | Graphcspn: Geometry-aware depth completion via dynamic gcns | |
CN113327318B (en) | Image display method, image display device, electronic equipment and computer readable medium | |
US12223289B2 (en) | Neural network device for neural network operation, operating method of the neural network device, and application processor including the same | |
CN111797678A (en) | Phase unwrapping method and device based on composite neural network | |
CN117635444A (en) | Depth completion method, device and equipment based on radiation difference and spatial distance | |
CN114387197A (en) | A binocular image processing method, device, device and storage medium | |
CN114842270A (en) | Target image classification method and device, electronic equipment and medium | |
US11119507B2 (en) | Hardware accelerator for online estimation | |
US20240046422A1 (en) | Pseudoinverse guidance for data restoration with diffusion models | |
CN116739154A (en) | Fault prediction method and related equipment thereof | |
CN117953029B (en) | General depth map completion method and device based on depth information propagation | |
WO2024222115A1 (en) | Neural network pruning method and apparatus, and data processing method and apparatus | |
WO2024235107A1 (en) | Object model rotation method and related device thereof | |
CN116486230B (en) | Image detection method and storage medium based on semi-recursive feature pyramid structure | |
CN116309226A (en) | Image processing method and related equipment thereof | |
CN117252912A (en) | Depth image acquisition method, electronic device and storage medium | |
CN113902107A (en) | Data processing method, readable medium and electronic device for fully connected layer of neural network model | |
CN114707070A (en) | User behavior prediction method and related equipment thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |