CN109919825B - An ORB-SLAM Hardware Accelerator - Google Patents

Info

Publication number
CN109919825B
Authority
CN
China
Prior art keywords
cache
orb
unit
module
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910084078.9A
Other languages
Chinese (zh)
Other versions
CN109919825A (en)
Inventor
杨建磊
刘润泽
赵巍胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-01-29
Filing date
2019-01-29
Publication date
2020-11-27
2019-01-29 Application filed by Beihang University
2019-01-29 Priority to CN201910084078.9A
2019-06-21 Publication of CN109919825A
Application granted
2020-11-27 Publication of CN109919825B
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an ORB-SLAM hardware accelerator comprising: an FPGA hardware acceleration module that accelerates feature extraction and feature matching; a sensor module that captures images; and a processor system that acts as the host, controls the FPGA hardware acceleration module and the sensor module, and runs pose estimation, pose optimization, and map updating. The FPGA hardware acceleration module accelerates the most computationally intensive and time-consuming stages of the ORB-SLAM pipeline. This raises the running speed of ORB-SLAM, lowers power consumption, substantially improves energy efficiency, and makes ORB-SLAM easier to deploy on power-constrained platforms.

Figure 201910084078

Description

An ORB-SLAM Hardware Accelerator

Technical Field

The invention relates to the field of autonomous navigation, and in particular to an ORB-SLAM hardware accelerator.

Background

SLAM (simultaneous localization and mapping) is one of the key technologies in autonomous navigation. It enables an autonomous navigation system in an unknown environment to incrementally build a map of its surroundings from sensor data while simultaneously determining its own position within that environment. SLAM is widely used in autonomous vehicles, autonomous mobile robots, virtual reality, and augmented reality.

ORB-SLAM is a feature-based visual SLAM system built on ORB descriptors. It is efficient and robust and has attracted extensive research attention. Existing ORB-SLAM systems run only on conventional computing platforms (CPUs and GPUs), which gives them a relatively low performance-to-power ratio. Because of the performance and power limits of these platforms, running ORB-SLAM in real time at low power has remained a hard problem. Feature extraction and matching consume a large share of the computing resources, so real-time operation usually requires a high-performance CPU and GPU, at a substantial cost in power. Using an embedded CPU or GPU to save power instead drops the frame rate too low for real-time operation.

Summary of the Invention

The present invention provides an ORB-SLAM hardware accelerator that addresses the difficulty of running ORB-SLAM in real time at low power.

An ORB-SLAM hardware accelerator comprises a sensor module, an FPGA hardware acceleration module, and a processor system.

The sensor module acquires image data.

The FPGA hardware acceleration module extracts ORB features from the images acquired by the sensor module and matches the extracted features with map points in the global map.

The processor system calculates the camera pose and maintains the global map, using the ORB features extracted by the FPGA hardware acceleration module and the results of matching those features with map points in the global map.

Preferably, the FPGA hardware acceleration module comprises an image downsampling module, a feature extraction module, and a feature matching module.

The image downsampling module generates an image pyramid.

The feature extraction module extracts ORB features on each layer of the image pyramid.

The feature matching module matches the ORB features extracted by the feature extraction module with map points in the global map.

Preferably, the feature extraction module comprises a keypoint extraction unit, a non-maximum suppression unit, a Gaussian filtering unit, a feature orientation calculation unit, a descriptor calculation unit, and a first cache unit.

The keypoint extraction unit extracts FAST corners from the input image and calculates the Harris response value of each FAST corner.

The non-maximum suppression unit performs non-maximum suppression on the FAST corners according to their Harris response values, keeping only the FAST corner with the largest Harris response value in each neighborhood.

The Gaussian filtering unit applies Gaussian blur to the input image, removing noise and producing a smoothed image.

The feature orientation calculation unit calculates the feature orientation on the Gaussian-filtered image.

The descriptor calculation unit calculates the BRIEF descriptor of each feature on the Gaussian-filtered image according to the feature orientation.

The first cache unit temporarily stores input data, intermediate results, and final results.

Preferably, the feature orientation calculation unit works as follows: first, it calculates the coordinates of the gray-level (intensity) centroid of the neighborhood around the feature; then it calculates the ratio of the centroid's horizontal and vertical coordinates and obtains the feature orientation from a lookup table.
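The two-step orientation workflow above can be sketched in software as follows. This is a minimal illustration and not the patented circuit: the bin count (`BINS_PER_QUADRANT`), the table contents (`TAN_LUT`), and the function names are assumptions, and the ratio test is done by cross-multiplication rather than an explicit division.

```python
import math

BINS_PER_QUADRANT = 8            # hypothetical resolution: 32 bins over 360 degrees
N_BINS = 4 * BINS_PER_QUADRANT
# Lookup table: tangent of the upper edge of each first-quadrant bin except the last.
TAN_LUT = [math.tan(math.radians(90.0 * (i + 1) / BINS_PER_QUADRANT))
           for i in range(BINS_PER_QUADRANT - 1)]

def centroid_moments(patch):
    """Return (m10, m01): intensity-weighted sums of x and y offsets from the patch centre."""
    r = len(patch) // 2
    m10 = m01 = 0
    for y, row in enumerate(patch):
        for x, v in enumerate(row):
            m10 += (x - r) * v
            m01 += (y - r) * v
    return m10, m01

def orientation_bin(m10, m01):
    """Quantise the direction of (m10, m01) into one of N_BINS bins using the table.

    Angles follow image coordinates (y grows downward). The comparison
    |m01| > t * |m10| is equivalent to comparing the ratio |m01| / |m10| with t.
    """
    ax, ay = abs(m10), abs(m01)
    q = sum(1 for t in TAN_LUT if ay > t * ax)   # bin index inside the first quadrant
    b = BINS_PER_QUADRANT
    if m10 >= 0 and m01 >= 0:
        return q                 # 0..90 degrees
    if m10 < 0 and m01 >= 0:
        return 2 * b - 1 - q     # 90..180 degrees (mirror of the first quadrant)
    if m10 < 0 and m01 < 0:
        return 2 * b + q         # 180..270 degrees
    return 4 * b - 1 - q         # 270..360 degrees
```

Using only comparisons against a small table avoids both the divider and the arctangent, which is one plausible reason to prefer a lookup table in hardware.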

Preferably, the first cache unit comprises an input image cache, a smoothed image cache, a response value cache, and a feature cache. The input image cache, smoothed image cache, and response value cache use a ping-pong-like architecture: each consists of several identical banks, so data can be written and read at the same time. The feature cache uses a max-heap architecture that filters features as it stores them, retaining only the features with the larger Harris response values.
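The filtering behaviour of the feature cache can be illustrated with a bounded heap. The sketch below is a software analogue under stated assumptions: the class name `FeatureCache` and the `capacity` parameter are hypothetical, and it uses Python's `heapq` (a min-heap whose root is the weakest retained feature), which may differ from the heap organisation of the hardware described in the text.

```python
import heapq

class FeatureCache:
    """Keep only the `capacity` features with the largest response values."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []      # entries are (response, insertion_index, feature)
        self._count = 0      # insertion index breaks ties between equal responses

    def push(self, response, feature):
        item = (response, self._count, feature)
        self._count += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif response > self._heap[0][0]:
            # The new feature beats the weakest stored one: replace it.
            heapq.heapreplace(self._heap, item)

    def drain(self):
        """Return the retained (response, feature) pairs, strongest first."""
        return [(r, f) for r, _, f in sorted(self._heap, reverse=True)]
```

For example, pushing responses 5, 1, 9, 3, 7 into a cache of capacity 3 leaves the features with responses 9, 7, and 5.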

Preferably, the units of the feature extraction module use a streaming computing architecture: all computing units run in parallel, the caches hold only part of the data at any time, and data is discarded as soon as it is no longer needed.
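A small sketch of the "hold only part of the data" idea: a generator that consumes image rows one at a time, keeps just the last `k` rows, and emits every `k` x `k` window. The function name and the default `k=7` (matching the 7x7 windows mentioned later in the description) are choices made for this illustration; the hardware's actual buffering scheme is not specified at this level of detail.

```python
from collections import deque

def stream_windows(rows, k=7):
    """Yield (cy, cx, window) for every k x k window while buffering only k rows.

    `rows` is any iterable of equal-length lists of pixel values.
    """
    buf = deque(maxlen=k)        # the oldest row is dropped automatically
    r = k // 2
    for y, row in enumerate(rows):
        buf.append(row)
        if len(buf) < k:
            continue             # not enough rows buffered yet
        cy = y - r               # centre row of the windows that are now complete
        for cx in range(r, len(row) - r):
            window = [line[cx - r: cx + r + 1] for line in buf]
            yield cy, cx, window
```

Because the input can be an iterator, the full image never has to be resident at once, which mirrors the reason the caches can stay small.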

Preferably, the feature matching module comprises a matching unit and a second cache unit.

The matching unit matches the features extracted by the feature extraction module with map points in the global map. It works as follows: first, it calculates the Hamming distance between every pair of descriptors drawn from the two descriptor sets (the descriptors of map points in the global map and the descriptors of features extracted from the current frame); then it matches the two sets by brute-force search over the Hamming distances.

The second cache unit comprises a descriptor cache and a result cache, which temporarily store the two sets of descriptors to be matched and the matching results, respectively.
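The matching workflow above can be sketched as follows, with each binary descriptor held in a Python integer. The function names and the optional `max_dist` rejection threshold are assumptions added for illustration; the text itself only specifies Hamming distance plus brute-force search.

```python
def hamming(a, b):
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")

def brute_force_match(map_desc, frame_desc, max_dist=None):
    """For each frame descriptor, find the closest map descriptor by exhaustive search.

    Returns a list of (frame_index, map_index, distance). `max_dist` is a
    hypothetical threshold; when given, weaker matches are dropped.
    """
    matches = []
    for i, f in enumerate(frame_desc):
        best_j, best_d = -1, None
        for j, m in enumerate(map_desc):
            d = hamming(f, m)
            if best_d is None or d < best_d:   # the first minimum wins ties
                best_j, best_d = j, d
        if best_d is not None and (max_dist is None or best_d <= max_dist):
            matches.append((i, best_j, best_d))
    return matches
```

Every distance computation is independent of the others, which is what makes this search easy to spread across many parallel distance calculators and comparators.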

The processor system comprises a general-purpose processor, memory, and a memory controller. The general-purpose processor calculates the camera pose from the extracted image features and their matches with map points in the global map, and updates and maintains the global map.

The FPGA hardware acceleration module and the processor system communicate over an AXI bus. The general-purpose processor can directly configure the instruction registers in the FPGA hardware acceleration module over the AXI bus; the FPGA hardware acceleration module can likewise read data directly from the processor system's memory over the AXI bus, or write its results back to that memory.

The FPGA hardware acceleration module and the processor system run in parallel in a pipelined manner.

The advantage and effect of the ORB-SLAM hardware accelerator of the present invention is that a dedicated hardware acceleration module accelerates the most computationally intensive stages of ORB-SLAM, raising the frame rate of the whole system while lowering power consumption, which makes low-power real-time operation of ORB-SLAM possible.

Brief Description of the Drawings

FIG. 1 is a schematic structural diagram of an ORB-SLAM hardware accelerator according to an embodiment of the present invention.

FIG. 2 is a schematic structural diagram of the feature extraction module of an ORB-SLAM hardware accelerator according to an embodiment of the present invention.

FIG. 3 is a schematic structural diagram of the feature matching module of an ORB-SLAM hardware accelerator according to an embodiment of the present invention.

FIG. 4 is a schematic diagram of the pipeline when the FPGA hardware acceleration module and the processor system of an ORB-SLAM hardware accelerator run in parallel, according to an embodiment of the present invention.

Detailed Description

To address the difficulty of running existing ORB-SLAM in real time on low-power platforms, the present invention proposes an ORB-SLAM hardware accelerator. The accelerator uses a dedicated hardware acceleration module to accelerate the most computationally intensive stages of the ORB-SLAM pipeline, improving the overall performance-to-power ratio of the system and making low-power real-time operation possible. The invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

FIG. 1 is a schematic structural diagram of an ORB-SLAM hardware accelerator according to an embodiment of the present invention. As shown in FIG. 1, the accelerator consists of three parts: an FPGA hardware acceleration module, a processor system, and a sensor module. The processor system is the host; the sensor module is connected to the processor system through a universal serial bus (USB), and the FPGA hardware acceleration module is connected to the processor system through an AXI bus. The processor system contains a general-purpose processor and memory; the FPGA hardware acceleration module contains an image downsampling module, a feature extraction module, and a feature matching module.

The sensor module captures images at a fixed rate. Each captured image is transferred to the memory of the processor system for temporary storage. The general-purpose processor then notifies the feature extraction module to start. On receiving the notification, the feature extraction module initiates the data transfer without involving the general-purpose processor, copies the image stored in memory into the input image cache, and begins extracting features from it. At the same time, the feature extraction module notifies the image downsampling module to downsample the image stored in memory, generating an image pyramid for further feature extraction. When feature extraction is complete, the feature extraction module writes the extracted features to memory and to the descriptor cache of the feature matching module, and notifies the general-purpose processor with an interrupt. The general-purpose processor then notifies the feature matching module to start. The feature matching module reads the map points of the global map from memory, matches them against the features obtained from the feature extraction module, writes the results to memory, and informs the general-purpose processor. After that, the general-purpose processor performs, in order, pose estimation, pose optimization, and keyframe determination. If the frame is a keyframe, it also updates the global map.

FIG. 2 is a schematic structural diagram of the feature extraction module of an ORB-SLAM hardware accelerator according to an embodiment of the present invention, where NMS denotes the non-maximum suppression unit. The feature extraction module extracts ORB features from the input image. It reads the image from memory through the AXI interface, extracts ORB features from it, writes the results back to memory through the AXI interface, and sends the feature descriptors to the feature matching module. As shown in FIG. 2, the feature extraction module contains:

AXI interface: used to communicate with the processor system. The AXI interface comprises a master interface and a slave interface. The slave interface receives instructions from the general-purpose processor in the processor system; the master interface reads and writes the memory of the processor system. Note that the feature extraction module does not need the general-purpose processor when it reads or writes memory through the AXI master interface, so the general-purpose processor can handle other tasks at the same time.

Keypoint extraction unit: extracts FAST keypoints from the image and calculates the Harris response value of each keypoint. The unit takes a 7x7 pixel region as input and decides whether the center pixel is a FAST keypoint by comparing its gray value with the gray values of the pixels on the surrounding circle of radius 3. It calculates the response value from the differences between the gray value of the center pixel and those of the pixels on the circle.

Gaussian filtering unit: applies Gaussian blur to the image. The unit stores a 7x7 Gaussian kernel and produces the smoothed image by convolving the kernel with the pixels of the input image.

Non-maximum suppression unit: performs non-maximum suppression on the keypoints. The unit filters keypoints by their Harris response values, keeping only the keypoint with the largest response value in any 3x3 neighborhood.

Feature orientation calculation unit: calculates the feature orientation on the smoothed image. The unit takes a circular region with a radius of 15 pixels around the feature position as input. It first calculates the gray-level centroid of the region and then takes the vector from the feature position to the centroid as the feature orientation.

Descriptor calculation unit: calculates the feature descriptor on the smoothed image. The unit stores 256 pairs of test positions and produces a 256-bit descriptor by comparing the pixel gray values at these 256 pairs of positions within the 31x31 neighborhood around the feature. When calculating a descriptor, the unit rotates the test positions to align with the orientation of the feature point, which makes the feature rotation-invariant.

First cache unit: consists of an input image cache, a smoothed image cache, a response value cache, and a feature cache, where:

Input image cache, smoothed image cache, response value cache: cache the input image, the smoothed image, and the keypoint response values, respectively. These three caches use a ping-pong-like design in which each consists of several identical banks. While some banks are busy outputting data and cannot accept input, the remaining banks can receive input data. Note that because the feature extraction module uses a pipelined computing architecture, data in the caches can be discarded as soon as it has been used, so these three caches need to hold only a small amount of data. The input image cache, for example, only needs to hold 16 rows of pixels rather than the entire image.

Feature cache: stores the extracted features and filters them. The feature cache uses a max-heap architecture: as features arrive, it heap-sorts them by response value and retains only the features with the larger response values.
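The keypoint test and the 3x3 non-maximum suppression described in the component list above can be sketched as below. Several details are assumptions, not taken from the text: the 16-pixel circle offsets (the usual radius-3 Bresenham circle), the `threshold` and `arc` parameters of the segment test, and the exact score formula. The score here is a simple sum of intensity differences, in line with the description of the response being computed from center-to-circle differences; a response of 0 marks a non-keypoint.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle around the centre.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def fast_response(img, x, y, threshold=20, arc=9):
    """Return a response score for pixel (x, y); 0 means 'not a keypoint'."""
    c = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):                      # brighter arc first, then darker arc
        flags = [sign * (p - c) > threshold for p in ring]
        run = best = 0
        for f in flags + flags:               # doubled list handles wrap-around
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= arc:
            return sum(abs(p - c) - threshold
                       for p in ring if sign * (p - c) > threshold)
    return 0

def nms3x3(resp):
    """Keep one keypoint per 3x3 neighbourhood: the one with the largest response."""
    kept = []
    for y in range(1, len(resp) - 1):
        for x in range(1, len(resp[0]) - 1):
            v = resp[y][x]
            if v == 0:
                continue
            is_max = True
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if (dx, dy) == (0, 0):
                        continue
                    n = resp[y + dy][x + dx]
                    earlier = (dy, dx) < (0, 0)   # neighbour precedes us in raster order
                    if n > v or (earlier and n == v):
                        is_max = False
            if is_max:
                kept.append((x, y, v))
    return kept
```

The raster-order tie-break is a design choice for this sketch: when two adjacent pixels have equal responses, only the earlier one survives, so a neighborhood never keeps two keypoints.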

The feature extraction module extracts ORB features from an image as follows. First, part of the image is loaded into the input image cache through the AXI interface. The keypoint extraction unit and the Gaussian filtering unit then start, respectively, extracting FAST keypoints from and applying Gaussian blur to these pixels. While extracting FAST keypoints, the keypoint extraction unit also calculates each keypoint's Harris response value and stores it in the response value cache. (Note that the data in the response value cache encodes not only the response value but also whether a pixel is a keypoint: a response value of 0 means the pixel is not a keypoint; any other value means it is.) The Gaussian filtering unit stores the smoothed image it generates in the smoothed image cache. Next, the non-maximum suppression (NMS) unit performs non-maximum suppression on the extracted keypoints, and the feature orientation calculation unit calculates the orientation of each keypoint on the smoothed image. The descriptor calculation unit then calculates each feature's descriptor according to its orientation and stores the result in the feature cache. When all calculations are finished, the results held in the feature cache are written back to memory and sent to the feature matching module. Note that the computing units and caches in the feature extraction module do not actually run serially; they run in parallel in a pipelined manner.
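The descriptor step of this flow (256 intensity comparisons at test positions rotated to the feature orientation) can be sketched as follows. The test pattern here is pseudo-random with a fixed seed, which is an assumption made only so the example is self-contained; the description says the unit stores 256 pairs of test positions but does not list them. The radius limit of 13 is likewise an assumption, chosen so that rotated positions stay inside the 31x31 neighborhood.

```python
import math
import random

def make_pattern(n_pairs=256, radius=13, seed=0):
    """Hypothetical test pattern: n_pairs pairs of integer offsets inside a disc."""
    rng = random.Random(seed)

    def point():
        while True:
            x, y = rng.randint(-radius, radius), rng.randint(-radius, radius)
            if x * x + y * y <= radius * radius:
                return x, y

    return [(point(), point()) for _ in range(n_pairs)]

def rotated_brief(img, cx, cy, angle, pattern):
    """Return the descriptor of the feature at (cx, cy) as an integer bit string."""
    c, s = math.cos(angle), math.sin(angle)

    def sample(px, py):
        # Rotate the test offset by the feature orientation, then read the pixel.
        rx = int(round(px * c - py * s))
        ry = int(round(px * s + py * c))
        return img[cy + ry][cx + rx]

    desc = 0
    for bit, ((x1, y1), (x2, y2)) in enumerate(pattern):
        if sample(x1, y1) < sample(x2, y2):
            desc |= 1 << bit
    return desc
```

As a quick check of the rotation-invariance idea, describing an image at angle 0 and describing the same image rotated by a quarter turn at angle pi/2 yields the same bit string with this sketch.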

FIG. 3 is a schematic structural diagram of the feature matching module of an ORB-SLAM hardware accelerator according to an embodiment of the present invention. The feature matching module matches the features that the feature extraction module extracted from the image with map points in the global map. It contains:

AXI interface: identical to the AXI interface in the feature extraction module.

Matching unit: matches two sets of descriptors (the descriptors of map points in the global map and the descriptors of features extracted from the current frame). The matching unit contains multiple Hamming distance calculators and comparators. It first calculates the Hamming distances between descriptors as a measure of their similarity, then pairs up the descriptors from the two sets that are most similar.

Descriptor cache, result cache: cache the descriptors and the matching results, respectively. The two sets of descriptors held in the descriptor cache (the descriptors of map points in the global map and the descriptors of features extracted from the current frame) come from memory and from the feature extraction module, respectively. Note that, to reduce data transfer overhead, the map point descriptors held in the descriptor cache are updated if and only if the global map is updated.

FIG. 4 is a schematic diagram of the pipeline when the FPGA hardware acceleration module and the processor system of an ORB-SLAM hardware accelerator run in parallel, according to an embodiment of the present invention. Each rectangle represents a stage of the SLAM workflow: PE denotes pose estimation, PO pose optimization, FE feature extraction, FM feature matching, and MU global map update. PS denotes the processor system.

When processing an ordinary frame, the ORB-SLAM hardware accelerator runs feature extraction, feature matching, pose estimation, and pose optimization in sequence. Feature extraction and feature matching run in the FPGA hardware acceleration module, while pose estimation and pose optimization run in the processor system. To let the FPGA hardware acceleration module and the processor system run in parallel and raise throughput, the FPGA hardware acceleration module starts feature extraction and feature matching on the next frame while the processor system is still performing pose estimation and pose optimization on the current one.

Processing a keyframe differs from processing an ordinary frame in that the processor system must update the global map after pose estimation and pose optimization. Because feature matching for the next frame needs the updated global map, the FPGA hardware acceleration module runs only the next frame's feature extraction while the processor system is busy, and starts feature matching only after the global map update has finished.
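The scheduling rules for ordinary frames and keyframes can be made concrete with a small timing model. Everything numeric here is hypothetical: the stage durations are arbitrary units chosen for illustration, and the function and field names are not from the patent. The model encodes only the dependencies stated above: the FPGA runs FE then FM per frame, the PS runs PE+PO (plus MU for keyframes), and FM for a frame must wait until any pending map update has finished.

```python
def schedule(frames, t_fe, t_fm, t_pose, t_mu):
    """Simulate the two-stage pipeline.

    frames: list of booleans, True where the frame is a keyframe.
    t_fe, t_fm: FPGA time for feature extraction and feature matching.
    t_pose: PS time for pose estimation plus pose optimization.
    t_mu: PS time for the global map update (keyframes only).
    Returns one dict of start/end times per frame.
    """
    fpga_free = ps_free = map_ready = 0
    timeline = []
    for is_key in frames:
        fe_start = fpga_free                  # FPGA starts FE as soon as it is idle
        fe_end = fe_start + t_fe
        fm_start = max(fe_end, map_ready)     # FM needs the up-to-date global map
        fm_end = fm_start + t_fm
        fpga_free = fm_end
        ps_start = max(fm_end, ps_free)       # PS needs the matches and must be idle
        ps_end = ps_start + t_pose + (t_mu if is_key else 0)
        ps_free = ps_end
        if is_key:
            map_ready = ps_end                # map is usable again after MU
        timeline.append({"fe_start": fe_start, "fm_start": fm_start,
                         "fm_end": fm_end, "ps_start": ps_start, "ps_end": ps_end})
    return timeline
```

With durations FE=4, FM=2, PE+PO=3, MU=5, two ordinary frames finish at times 9 and 15, and the second frame's FE (time 6 to 10) overlaps the first frame's pose stage. If the first frame is a keyframe instead, the second frame's FM is held until the map update ends at time 14, so the second frame finishes at 19.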

The present invention has been further explained above by means of specific embodiments. Note that the foregoing describes only specific embodiments of the invention and should not be used to limit it. Any modification, substitution, or improvement within the spirit of the present invention falls within its scope of protection.

Claims (6)

1. An ORB-SLAM hardware accelerator, comprising: the accelerator comprises a sensor module, an FPGA hardware acceleration module and a processor system;
the sensor module is used for acquiring image data;
the FPGA hardware acceleration module is used for extracting ORB features from the image acquired by the sensor module and matching the extracted features with map points in the global map;
the processor system is used for calculating the camera pose and maintaining the global map according to the ORB features extracted by the FPGA hardware acceleration module and the matching result of the ORB features and map points in the global map;
the FPGA hardware acceleration module comprises: the device comprises an image down-sampling module, a feature extraction module and a feature matching module;
the image down-sampling module is used for generating an image pyramid;
the feature extraction module is used for extracting ORB features on each layer of the image pyramid;
the feature matching module is used for matching the ORB features extracted by the feature extraction module with map points in the global map;
the feature extraction module includes: the device comprises a key point extracting unit, a non-maximum value inhibiting unit, a Gaussian filtering unit, a characteristic direction calculating unit, a descriptor calculating unit and a first cache unit;
the key point extracting unit is used for extracting a FAST corner on an input image and calculating a Harris response value of the FAST corner;
the non-maximum suppression unit is used for performing non-maximum suppression on the FAST corner according to the Harris response value of the FAST corner, and reserving the FAST corner with the maximum Harris response value in the neighborhood;
the Gaussian filtering unit is used for carrying out Gaussian blur processing on the input image, removing noise in the image and generating a smooth image;
the characteristic direction calculating unit is used for calculating a characteristic direction on the image after Gaussian filtering;
the descriptor calculation unit is used for calculating a BRIEF descriptor of the feature on the image after Gaussian filtering according to the feature direction;
the first cache unit is used for temporarily storing input data, intermediate calculation results and final results;
the work flow of the characteristic direction calculating unit is as follows: firstly, calculating the pixel gray scale centroid coordinates of the neighborhood where the features are located; then, calculating the ratio of the horizontal coordinate to the vertical coordinate of the gray centroid, and obtaining the direction of the features according to a lookup table;
the first cache unit includes: inputting an image cache, a smooth image cache, a response value cache and a characteristic cache; the input image cache, the smooth image cache and the response value cache adopt a ping-pong-like architecture, are composed of a plurality of same caches and can simultaneously process the input and the output of data; the feature cache adopts a maximum heap architecture, is used for screening the features while preserving the features, and only reserves partial features with large Harris response values.
2. The ORB-SLAM hardware accelerator of claim 1, wherein: each unit of the feature extraction module adopts a pipeline type computing architecture, all computing units run in parallel, only partial data is stored in a cache, and the data is discarded immediately after being used.
3. The ORB-SLAM hardware accelerator of claim 1, wherein: the feature matching module includes: the matching unit and the second cache unit;
the matching unit is used for matching the features extracted by the feature extraction module with map points in the global map; the working process is as follows: firstly, calculating the Hamming distance between any two descriptors in two sets of descriptors; then, matching the two groups of descriptors according to the Hamming distance by using a violent searching method; the two groups of descriptors refer to descriptors of map points in the global map and descriptors of features extracted from a previous frame;
the second cache unit comprises a descriptor cache and a result cache, and is used for temporarily storing two groups of descriptors to be matched and matching results.
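The matching procedure of claim 3 (Hamming distance between every descriptor pair, followed by a brute-force search) can be sketched as follows. This is a minimal Python illustration, assuming each binary descriptor is held as a Python integer; the function names and the optional `max_distance` rejection threshold are assumptions added for the example and are not recited in the claims.

```python
def hamming_distance(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def brute_force_match(query, train, max_distance=None):
    """Match each query descriptor to its nearest train descriptor.

    Returns a list of (query_index, train_index, distance) tuples.
    Every pair is compared, so the cost is len(query) * len(train).
    """
    matches = []
    for qi, q in enumerate(query):
        best_idx, best_dist = -1, None
        for ti, t in enumerate(train):
            d = hamming_distance(q, t)
            if best_dist is None or d < best_dist:
                best_idx, best_dist = ti, d
        # drop the match if no candidate exists or it is too far away
        if best_idx >= 0 and (max_distance is None or best_dist <= max_distance):
            matches.append((qi, best_idx, best_dist))
    return matches
```

Because every distance is independent of the others, the inner loop is the kind of computation that parallelizes naturally in hardware; the sketch above only shows the sequential logic.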
4. The ORB-SLAM hardware accelerator of claim 1, wherein: the processor system comprises a general-purpose processor, a memory and a memory controller; and the general-purpose processor is configured to calculate the camera pose according to the extracted image features and the matching relation between the features and map points in the global map, and to update and maintain the global map.
5. The ORB-SLAM hardware accelerator of claim 4, wherein: the processor system and the FPGA hardware acceleration module communicate via an AXI bus; the general-purpose processor in the processor system can directly configure an instruction register in the FPGA hardware acceleration module through the AXI bus; the FPGA hardware acceleration module can also directly read data from the memory in the processor system through the AXI bus, or store calculation results in the memory.
6. The ORB-SLAM hardware accelerator of claim 1, wherein: the FPGA hardware acceleration module and the processor system run in parallel in a pipelined manner.
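The effect of running the FPGA hardware acceleration module and the processor system as a two-stage pipeline, as in claim 6, can be illustrated with a small timing model. This is a hedged Python sketch, not a measurement: the per-frame stage times are hypothetical inputs, and the model assumes the accelerator never stalls waiting for the processor (that is, results can always be buffered).

```python
def pipeline_finish_times(n_frames, t_accel, t_cpu):
    """Completion time of each frame when the two stages overlap.

    t_accel: per-frame time of the accelerator stage (extraction, matching)
    t_cpu:   per-frame time of the processor stage (pose, map update)
    """
    accel_done = 0
    cpu_done = 0
    finish = []
    for _ in range(n_frames):
        accel_done += t_accel  # accelerator moves straight to the next frame
        # processor starts once its input is ready and it is free
        cpu_done = max(accel_done, cpu_done) + t_cpu
        finish.append(cpu_done)
    return finish


def sequential_total(n_frames, t_accel, t_cpu):
    """Total time when the two stages run one after the other per frame."""
    return n_frames * (t_accel + t_cpu)
```

With hypothetical stage times of 10 and 5 time units over three frames, the pipelined model finishes at time 35 against 45 for sequential execution; in steady state the throughput is set by the slower of the two stages.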

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910084078.9A CN109919825B (en) 2019-01-29 2019-01-29 An ORB-SLAM Hardware Accelerator

Publications (2)

Publication Number Publication Date
CN109919825A CN109919825A (en) 2019-06-21
CN109919825B CN109919825B (en) 2020-11-27

Family

ID=66961065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910084078.9A Active CN109919825B (en) 2019-01-29 2019-01-29 An ORB-SLAM Hardware Accelerator

Country Status (1)

Country Link
CN (1) CN109919825B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991291B (en) * 2019-11-26 2021-09-07 清华大学 An Image Feature Extraction Method Based on Parallel Computing
CN113052750A (en) * 2021-03-31 2021-06-29 广东工业大学 Accelerator and accelerator for task tracking in VSLAM system
CN113112394A (en) * 2021-04-13 2021-07-13 北京工业大学 Visual SLAM front-end acceleration method based on CUDA technology
CN113536024B (en) * 2021-08-11 2022-09-09 重庆大学 An FPGA-based ORB_SLAM Relocation Feature Point Retrieval Acceleration Method
CN114283065B (en) * 2021-12-28 2024-06-11 北京理工大学 ORB feature point matching system and method based on hardware acceleration
CN115143960A (en) * 2022-06-27 2022-10-04 上海商汤科技开发有限公司 SLAM system, method, device, apparatus, medium, and program product

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1468399A (en) * 2000-10-10 2004-01-14 纳佐米通信公司 Java hardware accelerator using microcode engine
CN102446085A (en) * 2010-10-01 2012-05-09 英特尔移动通信技术德累斯顿有限公司 Hardware accelerator module and method for setting up same
CN104062977A (en) * 2014-06-17 2014-09-24 天津大学 Full-autonomous flight control method for quadrotor unmanned aerial vehicle based on vision SLAM
CN105022401A (en) * 2015-07-06 2015-11-04 南京航空航天大学 SLAM method through cooperation of multiple quadrotor unmanned planes based on vision
CN108846867A (en) * 2018-08-29 2018-11-20 安徽云能天智能科技有限责任公司 A kind of SLAM system based on more mesh panorama inertial navigations

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103400388B (en) * 2013-08-06 2016-12-28 中国科学院光电技术研究所 Method for eliminating Brisk key point error matching point pair by using RANSAC
CN108171734B (en) * 2017-12-25 2022-01-07 西安因诺航空科技有限公司 ORB feature extraction and matching method and device
CN108960251A (en) * 2018-05-22 2018-12-07 东南大学 A kind of images match description generates the hardware circuit implementation method of scale space

Also Published As

Publication number Publication date
CN109919825A (en) 2019-06-21

Similar Documents

Publication Publication Date Title
CN109919825B (en) An ORB-SLAM Hardware Accelerator
Liu et al. eSLAM: An energy-efficient accelerator for real-time ORB-SLAM on FPGA platform
CN110555901B (en) Method, device, equipment and storage medium for positioning and mapping dynamic and static scenes
CN105631798B (en) Low Power Consumption Portable realtime graphic object detecting and tracking system and method
WO2019105044A1 (en) Method and system for lens distortion correction and feature extraction
CN106204660A (en) A kind of Ground Target Tracking device of feature based coupling
CN107705322A (en) Motion estimate tracking and system
CN102831617A (en) Method and system for detecting and tracking moving object
CN111368759B (en) Monocular vision-based mobile robot semantic map construction system
CN111915661B (en) Point cloud registration method, system and computer readable storage medium based on RANSAC algorithm
CN111860651B (en) Monocular vision-based semi-dense map construction method for mobile robot
CN107358238A (en) A kind of method and system for extracting image feature information
Wan et al. An energy-efficient quad-camera visual system for autonomous machines on fpga platform
CN107590234A (en) A kind of method of the indoor vision positioning database redundancy information reduction based on RANSAC
CN114584785A (en) Real-time image stabilizing method and device for video image
CN115115698A (en) Pose estimation method of equipment and related equipment
Guo et al. UDTIRI: An online open-source intelligent road inspection benchmark suite
CN116563582A (en) Image template matching method and device based on domestic CPU and opencv
CN115482523A (en) Small object target detection method and system of lightweight multi-scale attention mechanism
CN103413326A (en) Method and device for detecting feature points in Fast approximated SIFT algorithm
CN118314183A (en) A multimodal image registration method, device and storage device
CN102004921A (en) Target identification method based on image characteristic analysis
CN109816709B (en) Monocular camera-based depth estimation method, device and equipment
WO2022188020A1 (en) Image processing method and apparatus, and movable platform and storage medium
CN115115605A (en) Method and system for realizing circle detection based on ZYNQ's Hough transform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant