
CN117351161A - Mapping methods, devices, storage media and electronic devices based on visual semantics


Info

Publication number
CN117351161A
Authority
CN
China
Prior art keywords
frame image
key frame
target
vehicle
images
Prior art date
Legal status
Pending
Application number
CN202311245393.8A
Other languages
Chinese (zh)
Inventor
侯晨波
王媛
王东虎
侯欢欢
杨波
刘春霞
Current Assignee
Beijing Yinwo Automotive Technology Co ltd
Original Assignee
Beijing Yinwo Automotive Technology Co ltd
Application filed by Beijing Yinwo Automotive Technology Co ltd
Priority to CN202311245393.8A
Publication of CN117351161A


Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R1/00: Optical viewing arrangements; Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles
    • B60R1/20: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles
    • B60R1/22: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle
    • B60R1/23: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with a predetermined field of view
    • B60R1/27: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with a predetermined field of view providing all-round vision, e.g. using omnidirectional cameras
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/38: Electronic maps specially adapted for navigation; Updating thereof
    • G01C21/3804: Creation or updating of map data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00: 3D [Three Dimensional] image rendering
    • G06T15/04: Texture mapping
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05: Geographic models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/10: Image acquisition
    • G06V10/16: Image acquisition using multiple overlapping images; Image stitching
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R2300/00: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
    • B60R2300/30: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of image processing
    • B60R2300/303: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of image processing using joined images, e.g. multiple camera images
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R2300/00: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
    • B60R2300/80: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the intended use of the viewing arrangement
    • B60R2300/802: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the intended use of the viewing arrangement for monitoring and displaying vehicle exterior blind spot views
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00: Indexing scheme for image data processing or generation, in general
    • G06T2200/04: Indexing scheme for image data processing or generation, in general involving 3D image data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00: Indexing scheme for image data processing or generation, in general
    • G06T2200/08: Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00: Indexing scheme for image data processing or generation, in general
    • G06T2200/32: Indexing scheme for image data processing or generation, in general involving image mosaicing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30248: Vehicle exterior or interior
    • G06T2207/30252: Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Graphics (AREA)
  • Mechanical Engineering (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a visual-semantics-based mapping method and device, a storage medium, and an electronic device, relating to the field of intelligent driving. The method comprises the following steps: acquiring multi-channel surround-view images of a target vehicle collected at the current moment in a target environment; performing bird's-eye-view stitching on the multi-channel surround-view images to generate a first key frame image; determining multiple historical frame images, free of the under-vehicle blind area, that correspond to the first key frame image; performing texture mapping on the under-vehicle blind area in the first key frame image based on the multiple historical frame images to obtain a second key frame image; converting the second key frame image into a local semantic map based on the initial pose of the target vehicle; and stitching the local semantic maps of the target vehicle at multiple moments in the target environment to generate a global map of the target environment. This scheme eliminates the feature occlusion and truncation caused by the visual blind area and ensures the continuity and accuracy of the semantic features of the second key frame image.

Description

Mapping methods, devices, storage media and electronic devices based on visual semantics

Technical field

This application relates to the field of intelligent driving technology, and specifically to a visual-semantics-based mapping method, device, storage medium, and electronic device.

Background

With the development of intelligent driving technology, vehicles are commonly equipped with panoramic surround-view imaging systems that help drivers better understand the environment around the vehicle. However, current panoramic surround-view systems still have a visual blind spot underneath the vehicle.

In related applications, a deep neural network can segment static road-marking elements in the panoramic surround-view mosaic, and the segmentation results are used as features for matching between images. However, when the mosaic contains a blind spot under the vehicle, the relevant features are truncated or completely occluded, which increases the error rate during feature matching and causes mapping errors to grow or mapping to fail outright.

Summary of the invention

In view of this, embodiments of this application provide a visual-semantics-based mapping method, device, storage medium, and electronic device.

In a first aspect, an embodiment of this application provides a visual-semantics-based mapping method, including: acquiring multi-channel surround-view images of a target vehicle collected at the current moment in a target environment; performing bird's-eye-view stitching of the multi-channel surround-view images to generate a first key frame image, where the first key frame image contains a blind area under the vehicle; determining multiple historical frame images, free of the under-vehicle blind area, that correspond to the first key frame image; performing texture mapping on the under-vehicle blind area in the first key frame image based on the multiple historical frame images to obtain a second key frame image without the blind area; converting the second key frame image into a local semantic map based on the initial pose of the target vehicle; and stitching the local semantic maps of the target vehicle at multiple moments in the target environment to generate a global map of the target environment.

With reference to the first aspect, in some implementations of the first aspect, performing texture mapping on the under-vehicle blind area in the first key frame image based on the multiple historical frame images to obtain the second key frame image without the blind area includes: determining, among the multiple historical frame images, a reference frame image used to map the under-vehicle blind area in the first key frame image; determining the wheel-speed pulse count of the target vehicle at the current moment and the wheel-speed pulse count at the historical moment corresponding to the reference frame image; determining, based on the two pulse counts, the rotation angle and offset of the target vehicle between the first key frame image and the reference frame image; determining an effective mapping region in the reference frame image based on the rotation angle and offset; and performing texture mapping on the under-vehicle blind area in the first key frame image based on the effective mapping region to obtain the second key frame image.

With reference to the first aspect, in some implementations of the first aspect, determining, among the multiple historical frame images, the reference frame image used to map the under-vehicle blind area in the first key frame image includes: determining transformation matrices between each of the multiple historical frame images and the first key frame image based on the driving parameters of the target vehicle; randomly sampling, based on a target sampling error and using the feature information in the first key frame image as the reference, the feature information corresponding to each historical frame image, to determine a sampling value for each historical frame image, where the sampling value characterizes how well the feature information in that historical frame image matches the feature information in the first key frame image; and determining the reference frame image from the multiple historical frame images based on their transformation matrices and sampling values.

With reference to the first aspect, in some implementations of the first aspect, performing texture mapping on the under-vehicle blind area in the first key frame image based on the effective mapping region to obtain the second key frame image includes: determining the contour line of the under-vehicle blind area; expanding outward from the contour line, away from the blind area, by a target distance to obtain an outer edge line, and taking the region between the contour line and the outer edge line as a transition region; performing texture mapping on the under-vehicle blind area in the first key frame image based on the effective mapping region to obtain an image to be calibrated; determining the gray-level gradient values of the effective mapping region in the image to be calibrated; and calibrating the gray levels of the transition region in the image to be calibrated based on these gradient values to obtain the second key frame image.

With reference to the first aspect, in some implementations of the first aspect, converting the second key frame image into a local semantic map based on the initial pose of the target vehicle includes: extracting semantic features from the second key frame image to obtain the semantic features of the target objects it contains, the semantic features including position, direction, and category features; determining the pose of the target vehicle at the current moment; and converting the second key frame image into a local semantic map based on the semantic features of the target objects, the current pose of the target vehicle, and the initial pose.

With reference to the first aspect, in some implementations of the first aspect, stitching the local semantic maps of the target vehicle at multiple moments in the target environment to generate the global map of the target environment includes: determining at least two local semantic maps to be matched among the local semantic maps at the multiple moments; determining a target point cloud and a source point cloud in the at least two local semantic maps; determining, based on the geometric features of the target point cloud, the nearest neighbor in the source point cloud of each point in the target point cloud; and stitching the local semantic maps based on these nearest neighbors to generate the global map of the target environment.

With reference to the first aspect, in some implementations of the first aspect, the visual-semantics-based mapping method further includes: matching features in the global map based on the initial pose of the target vehicle to determine the current pose of the target vehicle.

In a second aspect, an embodiment of this application provides a visual-semantics-based mapping device, including: an acquisition module for acquiring multi-channel surround-view images of a target vehicle collected at the current moment in a target environment; a first stitching module for performing bird's-eye-view stitching of the multi-channel surround-view images to generate a first key frame image containing an under-vehicle blind area; a determination module for determining multiple historical frame images, free of the blind area, that correspond to the first key frame image; a mapping module for performing texture mapping on the blind area in the first key frame image based on the multiple historical frame images to obtain a second key frame image without the blind area; a conversion module for converting the second key frame image into a local semantic map based on the initial pose of the target vehicle; and a second stitching module for stitching the local semantic maps of the target vehicle at multiple moments in the target environment to generate a global map of the target environment.

In a third aspect, an embodiment of this application provides a computer-readable storage medium that stores a computer program used to execute the method described in the first aspect.

In a fourth aspect, an embodiment of this application provides an electronic device including a processor and a memory for storing instructions executable by the processor, where the processor is configured to execute the method described in the first aspect.

In the embodiments of this application, motion compensation is used to map historical frame images, under specific transformation conditions, onto the under-vehicle blind area of the first key frame image. This eliminates the feature occlusion and truncation caused by the visual blind area, ensures the continuity and accuracy of the semantic features in the second key frame image, avoids feature truncation or loss during mapping, assists the user in driving, and improves the user experience. In addition, this application requires no additional camera modules and is therefore inexpensive.

Brief description of the drawings

The above and other objects, features, and advantages of this application will become more apparent from the more detailed description of its embodiments in conjunction with the accompanying drawings. The drawings provide further understanding of the embodiments, constitute a part of the specification, and serve, together with the embodiments, to explain this application; they do not limit it. In the drawings, the same reference numerals generally denote the same components or steps.

Figure 1 is a schematic flowchart of mapping provided by an exemplary embodiment of this application.

Figure 2 is a schematic flowchart of obtaining the second key frame image provided by an exemplary embodiment of this application.

Figure 3 is a schematic flowchart of conversion into a local semantic map provided by an exemplary embodiment of this application.

Figure 4 is a schematic flowchart of generating a global map provided by an exemplary embodiment of this application.

Figure 5 is a schematic structural diagram of the visual-semantics-based mapping device provided by an embodiment of this application.

Figure 6 is a schematic structural diagram of an electronic device provided by an embodiment of this application.

Detailed description of embodiments

The technical solutions in the embodiments of this application will now be described clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of this application.

Traditional SLAM (Simultaneous Localization and Mapping) technology carries only low-level information and cannot keep up with modern computer vision. With the rise of artificial intelligence, neural networks have surpassed traditional image processing in classification, detection, and segmentation, and have begun to show great advantages in industries such as autonomous driving, robotics, drones, and medical care. Compared with traditional vSLAM (visual SLAM), semantic vSLAM can obtain not only the geometric structure of the environment but also the semantic information of individual objects. In mapping, semantic information provides rich object information for building different types of semantic maps, such as pixel-level and object-level maps. In localization, semantic constraints improve the accuracy and robustness of pose estimation. Semantic vSLAM can therefore help robots perceive and adapt to unknown, complex environments more accurately and perform more complex tasks. However, when a local semantic map contains an under-vehicle blind area, the relevant features are truncated or completely occluded, which increases the error rate during feature matching and causes mapping errors to grow or mapping to fail.

In view of this, this application provides a visual-semantics-based mapping method. Motion compensation is performed based on the acquired first key frame image and the vehicle-body motion information, so that semantic features are filled into the under-vehicle blind area of the first key frame image. The resulting semantic information is more complete and more robust, and the success rate of feature matching is higher.

Figure 1 is a schematic flowchart of mapping provided by an exemplary embodiment of this application. As shown in Figure 1, in this embodiment the visual-semantics-based mapping method includes the following steps.

Step S110: acquire the multi-channel surround-view images of the target vehicle collected at the current moment in the target environment.

For example, the panoramic surround-view imaging system of the target vehicle is used to generate a panoramic surround-view top view. The video streams captured by the cameras installed around the target vehicle can be fed into the panoramic surround-view imaging system for preprocessing to obtain the multi-channel surround-view images; alternatively, the images captured by cameras a, b, c, and d installed around the target vehicle can be used directly as the multi-channel surround-view images. These images may be two-dimensional or three-dimensional, which is not limited in the embodiments of this application.

Step S120: perform bird's-eye-view stitching of the multi-channel surround-view images to generate the first key frame image.

The first key frame image contains the under-vehicle blind area. For example, the images from the cameras at different positions and orientations on the target vehicle are first transformed to a bird's-eye view and then stitched, yielding a first key frame image that shows the target vehicle and its surroundings from above. More specifically, vehicle cameras are generally installed at an oblique downward angle, so the raw surround-view image output by each camera is not a top view; to achieve the bird's-eye effect, the output of each camera must be projected to a new top-down viewpoint. The change of a point in a camera's raw surround-view image can be represented by a matrix, whose expression is shown below. The surround-view images of the different channels are then stitched through calibration to obtain the first key frame image.
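
The filing does not reproduce the matrix expression here. A standard planar homography, the usual form for this kind of bird's-eye projection (an assumption, since the patent's exact matrix is not shown), maps a pixel (u, v) of the raw camera image to a point (u', v') of the top-down view as:

$$ s\begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = H \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}, \qquad H = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}, $$

where s is a scale factor and H is typically estimated during calibration from ground-plane point correspondences for each camera.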

Step S130: determine the multiple historical frame images, free of the under-vehicle blind area, that correspond to the first key frame image.

The first key frame image and the multiple historical frame images may or may not be consecutive in time. When they are not consecutive, the time interval between every two frames must stay within a preset range, so that the point-cloud data of the two frames do not differ too much and the target-object information inherited from the historical frames remains timely. Note that the first key frame image is captured at the current moment, while the historical frame images were captured before the current moment.

Step S140: based on the multiple historical frame images, perform texture mapping on the under-vehicle blind area in the first key frame image to obtain the second key frame image without the blind area.

For example, the historical frame image with the highest correlation to the first key frame image is selected from the multiple historical frame images, and the region used for texture mapping is determined within it. The three-dimensional projection model corresponding to the panoramic surround-view imaging system is obtained, the world coordinates of its model points are obtained in the world coordinate system, and the world coordinates corresponding to the aforementioned region are computed. Using texture mapping, the first key frame image and that region are pasted onto the three-dimensional projection model and stitched to obtain a three-dimensional panorama. The three-dimensional panorama is then converted into the two-dimensional second key frame image according to the intrinsic and extrinsic parameters of the cameras of the surround-view system. Like the first key frame image, the second key frame image is a top-down view.
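
As a minimal two-dimensional illustration of this pasting step (a simplification: the patent routes the operation through a three-dimensional projection model, and all names below are hypothetical), pixels of an already-aligned reference frame can be copied into the blind-area mask as follows:

```python
import numpy as np

# Hedged 2D sketch: copy pixels of an already-aligned reference frame into
# the under-vehicle blind area of the current key frame. The patent performs
# this via a 3D projection model; this flat version only illustrates the idea.
def paste_blind_area(key_frame, aligned_ref, blind_mask):
    out = key_frame.copy()
    out[blind_mask > 0] = aligned_ref[blind_mask > 0]   # fill only blind pixels
    return out
```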

Step S150: convert the second key frame image into a local semantic map based on the initial pose of the target vehicle.

For example, sensors installed on the target vehicle are used to obtain its initial pose when it starts up in the target environment. The initial pose includes initial position data and initial attitude data. The initial position data may be absolute (for example, the longitude and latitude obtained directly from GPS, the Global Positioning System) or relative (for example, accumulated motion parameters such as distance, derived from the number of wheel rotations, speed, and so on). The initial attitude data may be an absolute heading angle obtained through differential GPS, or a relative heading angle (for example, an accumulated motion parameter such as direction, derived from the wheel steering angles).

For example, the local pose of the target vehicle can be determined from its initial pose at start-up and the accumulated travel distance and direction obtained from the IMU (Inertial Measurement Unit). Because this pose is obtained purely from motion changes, it is a relative, local pose.
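
A minimal dead-reckoning sketch of this accumulation, assuming per-tick speed and yaw-rate increments (the sensor interface and time step are illustrative assumptions, not the patent's):

```python
import math

# Hedged sketch: accumulate (speed, yaw-rate) increments on top of the
# initial pose (x, y, yaw) to obtain the relative local pose described above.
def integrate_pose(initial_pose, increments, dt=0.02):
    x, y, yaw = initial_pose
    for v, yaw_rate in increments:      # one pair per measurement tick
        yaw += yaw_rate * dt            # integrate heading
        x += v * math.cos(yaw) * dt     # integrate position in the plane
        y += v * math.sin(yaw) * dt
    return x, y, yaw
```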

Further, detection, tracking, and recognition are performed on the second key frame image, and the semantic entities (equivalently, target objects) in the target environment are determined from the results. The second key frame image is then converted into a local semantic map based on the initial pose of the target vehicle, the semantic entities in the target environment, and the local pose of the target vehicle.

Step S160: stitch the local semantic maps of the target vehicle at multiple moments in the target environment to generate the global map of the target environment.

The semantic entities in the local semantic maps at different moments are matched so that identical entities are stitched at the same position, yielding the global map of the target environment.

In this embodiment, motion compensation is used to map historical frame images, under specific transformation conditions, onto the under-vehicle blind area of the first key frame image. This eliminates the occlusion and truncation caused by the visual blind area, ensures the continuity and accuracy of the semantic features in the second key frame image, avoids feature truncation or loss during mapping, assists the user in driving, and improves the user experience. Moreover, no additional camera modules are needed, keeping costs low.

Building on the embodiment shown in Figure 1, the features in the global map can also be matched based on the initial pose of the target vehicle to determine the current pose of the target vehicle.

Specifically, in this embodiment the initial pose of the target vehicle refers to the pose of the target vehicle at the current moment, as collected by the odometer while the vehicle is driving.

Further, a feature ground-truth database is obtained from the global map. While the target vehicle is driving, the semantic features of the target objects contained in the multi-channel surround-view images it collects are determined. Based on these semantic features, the corresponding positioning-feature positions are looked up in the ground-truth database: for example, the semantic features of the target objects are transformed into the global coordinate system of the global map and then matched by feature type against the ground-truth features. A pose is then solved from the positioning-feature positions. For example, from the static feature points among the positioning features, triangulation is used to compute their current global positions and their positions relative to the lidar, from which the current global position of the lidar is solved; the current position of the vehicle is then obtained from the positioning-feature positions, yielding the solved pose.

In addition, the global position of the target vehicle collected by the satellite positioning system, the solved pose, and the initial pose of the target vehicle are fused, and the fusion result is taken as the current pose of the target vehicle.

In this embodiment, the features of the target environment are obtained from the multi-channel surround-view images and then registered in the global map; because the global information is known, the lookup is fast. Once the static global positions are obtained, the global position of the target vehicle is deduced from the geometric relationship between the vehicle and those static positions, which improves localization accuracy.

Figure 2 is a schematic flowchart of obtaining the second key frame image provided by an exemplary embodiment of this application. The embodiment shown in Figure 2 extends the embodiment shown in Figure 1; the following focuses on their differences, and the common parts are not repeated.

As shown in Figure 2, in this embodiment, performing texture mapping on the under-vehicle blind area in the first key frame image based on the multiple historical frame images to obtain the second key frame image without the blind area includes the following steps.

Step S210: determine, among the multiple historical frame images, the reference frame image used to map the under-vehicle blind area in the first key frame image.

In one implementation, the multiple historical frame images are ordered chronologically, and each can supply an effective mapping region covering part or all of the under-vehicle blind area of the first key frame image. The historical frame image that supplies the largest effective mapping region is selected as the reference frame image, to avoid the poor display quality caused by compositing the blind area from too many historical frames.

If none of the historical frame images can supply the entire effective mapping region, the historical frame image whose pixels cover the largest part of the blind area of the first key frame image can be selected as the reference frame image. Filling the blind area from a single large-area historical frame minimizes the number of stitching operations and seams, effectively avoiding the misalignment and brightness inconsistency caused by multi-image stitching and improving the display quality of the blind area.

In another implementation, the transformation matrices between each of the multiple historical frame images and the first key frame image can be determined based on the driving parameters of the target vehicle; based on a target sampling error, and using the feature information in the first key frame image as the reference, the feature information of each historical frame image is randomly sampled to determine a sampling value for each; and the reference frame image is determined from the multiple historical frame images based on their transformation matrices and sampling values.

The driving parameters include the steering-wheel angle and the vehicle speed, and a sampling value characterizes how well the feature information in a historical frame image matches the feature information in the first key frame image. Further, for each historical frame image, the computing resources (expressed numerically) consumed when applying its transformation matrix at that sampling value are determined. Weights are assigned to the computing-resource figure and the sampling value, a score is computed for the frame from these weights, and the historical frame image with the highest score is taken as the reference frame image. A higher score indicates higher feature similarity between the historical frame image and the first key frame image, and subsequent texture mapping with it consumes fewer computing resources; this preserves the feature similarity between the reference frame image and the first key frame image while reducing the resources consumed by the texture-mapping step.
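
A sketch of such a weighted score, under the assumption of a simple linear combination (the weights, the normalization, and the example inputs are illustrative, not the patent's):

```python
import numpy as np

# Hedged sketch: combine each historical frame's matching sample value with
# the (normalized) compute cost of applying its transformation matrix, and
# pick the highest-scoring frame as the reference frame.
def score_frames(sample_values, compute_costs, w_match=0.7, w_cost=0.3):
    s = np.asarray(sample_values, dtype=float)      # higher = better feature match
    c = np.asarray(compute_costs, dtype=float)      # higher = more expensive
    c_norm = (c - c.min()) / (c.max() - c.min() + 1e-9)
    scores = w_match * s - w_cost * c_norm          # penalize costly frames
    return int(np.argmax(scores))

ref_idx = score_frames([0.82, 0.91, 0.88], [120.0, 340.0, 150.0])  # example inputs
```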

Step S220: determine the wheel-speed pulse count of the target vehicle at the current moment and at the historical moment corresponding to the reference frame image.

The wheel pulse count is the number of pulses per wheel revolution; for example, one revolution may produce 1080 pulses.

Step S230: determine the rotation angle and offset of the target vehicle between the first key frame image and the reference frame image, based on the wheel-speed pulse counts at the current moment and at the historical moment.

For example, the principle of inertial navigation is used to compute the relative offset and rotation angle of the target vehicle between the reference frame image and the first key frame image. An inertial navigation system is an autonomous navigation system that neither depends on external information nor radiates energy outward. Its basic working principle rests on Newton's laws of mechanics: by measuring the acceleration of the carrier in an inertial reference frame, integrating it over time, and transforming it into the navigation coordinate system, one obtains the speed, yaw angle, position, and other information in the navigation coordinate system.
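
A differential-drive odometry sketch of this computation (the wheel radius and track width are illustrative assumptions; only the 1080 pulses-per-revolution figure comes from the example above):

```python
import math

PULSES_PER_REV = 1080    # from the example above
WHEEL_RADIUS = 0.32      # meters; an assumption
TRACK_WIDTH = 1.6        # meters between left and right wheels; an assumption

def pulses_to_distance(delta_pulses):
    # Arc length traveled by a wheel for a given pulse-count difference.
    return delta_pulses / PULSES_PER_REV * 2 * math.pi * WHEEL_RADIUS

def relative_motion(dl_pulses, dr_pulses, heading=0.0):
    dl = pulses_to_distance(dl_pulses)      # left wheel travel
    dr = pulses_to_distance(dr_pulses)      # right wheel travel
    d_theta = (dr - dl) / TRACK_WIDTH       # rotation angle, radians
    d_center = (dl + dr) / 2                # travel of the vehicle center
    dx = d_center * math.cos(heading + d_theta / 2)   # planar offset x
    dy = d_center * math.sin(heading + d_theta / 2)   # planar offset y
    return d_theta, (dx, dy)
```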

Step S240: determine the effective mapping region in the reference frame image based on the rotation angle and offset.

The reference frame image is overlaid on the first key frame image, with the target vehicle in the reference frame rotated by the above rotation angle and shifted by the above offset relative to the target vehicle in the first key frame image. This yields the effective mapping region: the part of the reference frame image that maps onto the under-vehicle blind area of the first key frame image. For example, let image B be the reference frame image and image A the first key frame image. Using the target vehicle in image B as the base point, image B is shifted and rotated according to the relative offset and rotation angle and thereby mapped into image A, which yields the effective mapping region.
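
A sketch of this overlay using an affine warp (the pixels-per-meter scale and the OpenCV route are assumptions for illustration; the patent does not specify an implementation):

```python
import math
import cv2

PX_PER_M = 50.0   # assumed bird's-eye scale: pixels per meter

def warp_reference(ref_img, d_theta_rad, dx_m, dy_m, vehicle_center_px):
    # Rotate image B about the vehicle position, then shift by the offset,
    # so that it overlays image A; the warped result is where the effective
    # mapping region is read from.
    M = cv2.getRotationMatrix2D(vehicle_center_px, math.degrees(d_theta_rad), 1.0)
    M[0, 2] += dx_m * PX_PER_M
    M[1, 2] += dy_m * PX_PER_M
    h, w = ref_img.shape[:2]
    return cv2.warpAffine(ref_img, M, (w, h))
```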

Step S250: perform texture mapping on the under-vehicle blind area in the first key frame image based on the effective mapping region, to obtain the second key frame image.

In one implementation, the contour line of the under-vehicle blind area of the first key frame image is determined; the contour is expanded outward, away from the blind area, by a target distance to obtain an outer edge line, and the region between the contour line and the outer edge line is taken as a transition region. Texture mapping is performed on the blind area based on the effective mapping region to obtain an image to be calibrated; the gray-level gradient values of the effective mapping region in that image are determined; and the gray levels of the transition region are calibrated based on those gradient values to obtain the second key frame image.

It should be understood that the shape of the transition region is not limited here, as long as its inner boundary coincides with the contour line of the under-vehicle blind area.

By setting a transition region and adjusting the gray-level gradient of the pixels within it, the visible fusion boundary between the effective mapping region and the rest of the second key frame image can be eliminated.
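
One way to realize such a gradient transition is distance-based feathering; the sketch below is an assumption about the mechanism (the patent calibrates gray levels from gradient values, while this version blends linearly over a band of assumed width):

```python
import cv2
import numpy as np

# Hedged sketch: feather the boundary between the pasted blind-area texture
# and the rest of the key frame over a band expanded outward from the
# blind-area contour.
def blend_transition(key_frame, patched, blind_mask, band_px=15):
    # Distance of each pixel to the blind area: 0 inside it, growing outside.
    dist = cv2.distanceTransform((blind_mask == 0).astype(np.uint8), cv2.DIST_L2, 3)
    alpha = np.clip(dist / band_px, 0.0, 1.0)[..., None]  # 0 = patch, 1 = original
    return (alpha * key_frame + (1 - alpha) * patched).astype(np.uint8)
```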

In this embodiment, determining the reference frame image from multiple historical frame images makes it possible to select an image that is similar to the first key frame image to some degree, ensuring the consistency of the features later texture-mapped onto the blind area. Determining the effective mapping region in the reference frame image from the rotation angle and offset further ensures the continuity of the semantic features during texture mapping and improves the visual quality of the second key frame image.

Figure 3 is a schematic flowchart of conversion into a local semantic map provided by an exemplary embodiment of this application. The embodiment shown in Figure 3 extends the embodiment shown in Figure 1; the following focuses on their differences, and the common parts are not repeated.

As shown in Figure 3, in this embodiment, converting the second key frame image into a local semantic map based on the initial pose of the target vehicle includes the following steps.

Step S310: perform semantic feature extraction on the second key frame image to obtain the semantic features of the target objects it contains.

The semantic features include position, direction, and category features. For example, a deep neural network extracts semantic entities (i.e., target objects) such as parking-space lines, parking-space corner points, speed bumps, lane lines, ground markings, buildings, and wheel stops. The attribute information of each semantic entity is then determined; attribute information may describe the physical characteristics of the entity, or the attributes by which the entity may affect the motion of the target vehicle itself. For example, it may be spatial attribute information such as the position, shape, size, and orientation of each entity, or category attribute information (such as whether an entity is a drivable road, a curb, a lane or lane line, a traffic sign, a road marking, a traffic light, a stop line, a crosswalk, a roadside tree, or a pillar). The relative positional relationship between each semantic entity and the target vehicle is determined from the second key frame image, and the spatial attribute information of the entity is determined from the local pose information and this relative relationship. The spatial attributes may include size, shape, orientation, height, occupancy, and other space-related properties. Beyond the spatial attributes, the category of each semantic entity can also be further determined from the second key frame image.

Step S320: determine the pose of the target vehicle at the current moment.

In this embodiment, the pose of the target vehicle at the current moment is the same as the local pose information in the embodiment shown in Figure 1.

Step S330: convert the second key frame image into a local semantic map based on the semantic features of the target objects it contains and the current pose and initial pose of the target vehicle.

Once the semantic entities in the second key frame image and their attribute information have been determined, this information can be combined to build a local semantic map based on the second key frame image. That is, the semantic-marking results of the second key frame image are reconstructed and attributes such as position and size are added, yielding a semantic-mark map with absolute attributes.
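
A minimal sketch of attaching an absolute position to a feature, assuming a planar pose (x, y, yaw) and metric feature coordinates in the vehicle frame (the conventions and names are illustrative):

```python
import numpy as np

# Hedged sketch: place a vehicle-frame semantic feature into the map frame
# using the vehicle pose.
def to_map_frame(feat_xy_m, pose_xy, pose_yaw):
    c, s = np.cos(pose_yaw), np.sin(pose_yaw)
    R = np.array([[c, -s], [s, c]])     # 2D rotation, vehicle -> map
    return R @ np.asarray(feat_xy_m) + np.asarray(pose_xy)
```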

In this embodiment, the local semantic map is obtained from the second key frame image, which has no under-vehicle blind area. This guarantees the completeness, continuity, and accuracy of the semantic features in the local semantic map and makes the localization in subsequent mapping more accurate.

Figure 4 is a schematic flowchart of generating a global map provided by an exemplary embodiment of this application. The embodiment shown in Figure 4 extends the embodiment shown in Figure 1; the following focuses on their differences, and the common parts are not repeated.

As shown in Figure 4, in this embodiment, stitching the local semantic maps of the target vehicle at multiple moments in the target environment to generate the global map of the target environment includes the following steps.

Step S410: determine at least two local semantic maps to be matched among the local semantic maps at multiple moments.

The at least two local semantic maps to be matched are consecutive in time.

Step S420: determine the target point cloud and the source point cloud in the at least two local semantic maps to be matched.

If there are two local semantic maps to be matched, the target point cloud and the source point cloud are the point clouds of these two maps; for example, the map containing the source point cloud can serve as the reference, and stitching proceeds according to the semantic features of the map containing the target point cloud. If there are more than two local semantic maps, one of them can be chosen, its point cloud taken as the source point cloud, and each of the remaining maps to be matched stitched to it in turn.

Step S430: based on the geometric features of the target point cloud, determine the nearest neighbor in the source point cloud of each point in the target point cloud.

Step S440: based on these nearest neighbors, stitch the local semantic maps of the target vehicle at multiple moments in the target environment to generate the global map of the target environment.

Denote the source point cloud by P and the target point cloud by Q. For each point in Q, find the nearest point in P, forming matched point pairs. Taking the sum of the Euclidean distances of all matched pairs as the objective function, use singular value decomposition to solve for the rotation matrix R and translation vector t that minimize it. Transform Q by R and t (rotation and translation) to obtain a new Q, find the corresponding point pairs again, and iterate until the error is minimal. Using the R and t obtained at minimal error, stitch the local semantic maps to generate the global map of the target environment.
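
The iteration described above is point-to-point ICP with an SVD (Kabsch) solve at each step. A compact sketch, assuming planar point clouds and brute-force nearest neighbors for clarity:

```python
import numpy as np

def best_rigid_transform(Q, P):
    # Solve min over R, t of sum ||R q_i + t - p_i||^2 via SVD (Kabsch).
    q_mean, p_mean = Q.mean(axis=0), P.mean(axis=0)
    H = (Q - q_mean).T @ (P - p_mean)       # cross-covariance of centered pairs
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, p_mean - R @ q_mean

def icp(Q, P, iters=30, tol=1e-6):
    # Iteratively align the moving cloud Q to the fixed cloud P.
    Q = Q.copy()
    prev_err = np.inf
    for _ in range(iters):
        d2 = ((Q[:, None, :] - P[None, :, :]) ** 2).sum(-1)  # pairwise distances
        nn = P[d2.argmin(axis=1)]           # nearest neighbor in P for each q
        R, t = best_rigid_transform(Q, nn)
        Q = Q @ R.T + t                     # apply the update
        err = np.linalg.norm(Q - nn, axis=1).mean()
        if abs(prev_err - err) < tol:       # stop when the error settles
            break
        prev_err = err
    return Q
```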

In this embodiment, matched point pairs between the source and target point clouds are found, a rotation matrix and translation vector are constructed from them, and the source point cloud is transformed into the coordinate system of the target point cloud using the solved rotation and translation. The error function between the transformed source point cloud and the target point cloud is then evaluated; if the error exceeds a threshold, the iteration continues until the post-stitching error based on the rotation and translation satisfies the given requirement. The method in this embodiment is simple and reasonably accurate.

上文结合图1至图4,详细描述了本申请的基于视觉语义的建图方法实施例,下面结合图5,详细描述本申请的基于视觉语义的建图装置实施例。应理解,基于视觉语义的建图方法实施例的描述与基于视觉语义的建图装置实施例的描述相互对应,因此,未详细描述的部分可以参见前面方法实施例。The above describes in detail the embodiment of the mapping method based on visual semantics of the present application with reference to Figures 1 to 4. The following describes the embodiment of the mapping device based on visual semantics of the present application in detail with reference to Figure 5. It should be understood that the description of the embodiments of the mapping method based on visual semantics corresponds to the description of the embodiments of the mapping device based on visual semantics. Therefore, the parts not described in detail can be referred to the previous method embodiments.

图5所示为本申请一示例性实施例提供的基于视觉语义的建图装置的结构示意图。如图5所示,本申请实施例提供的基于视觉语义的建图装置50包括:Figure 5 shows a schematic structural diagram of a mapping device based on visual semantics provided by an exemplary embodiment of the present application. As shown in Figure 5, the visual semantics-based mapping device 50 provided by the embodiment of the present application includes:

获取模块510,用于获取目标车辆在目标环境中采集的当前时刻的多路环视图像;The acquisition module 510 is used to acquire the multi-channel surround image of the target vehicle at the current moment collected in the target environment;

第一拼接模块520,用于对多路环视图像进行鸟瞰拼接,生成第一关键帧图像,第一关键帧图像中包括车底盲区;The first splicing module 520 is used to perform bird's-eye splicing of multi-channel surround-view images to generate a first key frame image, where the first key frame image includes the blind spot under the vehicle;

确定模块530,用于确定第一关键帧图像对应的无车底盲区的多幅历史帧图像;The determination module 530 is used to determine multiple historical frame images without vehicle bottom blind spots corresponding to the first key frame image;

贴图模块540,用于基于多幅历史帧图像,对第一关键帧图像中的车底盲区进行纹理贴图,得到无车底盲区的第二关键帧图像;The mapping module 540 is used to perform texture mapping on the vehicle bottom blind area in the first key frame image based on multiple historical frame images, and obtain a second key frame image without the vehicle bottom blind area;

转化模块550,用于基于目标车辆的初始位姿,将第二关键帧图像转化为局部语义地图;The conversion module 550 is used to convert the second key frame image into a local semantic map based on the initial pose of the target vehicle;

第二拼接模块560,用于对目标车辆在目标环境中的多个时刻的局部语义地图进行拼接,生成目标环境的全局地图。The second splicing module 560 is used to splice local semantic maps of the target vehicle at multiple moments in the target environment to generate a global map of the target environment.

In an embodiment of the present application, the mapping module 540 is further configured to: determine, among the multiple historical frame images, a reference frame image used for mapping the vehicle-bottom blind area in the first key frame image; determine the wheel-speed pulse count value of the target vehicle at the current time and its wheel-speed pulse count value at the historical time corresponding to the reference frame image; determine, based on these two count values, the rotation angle and offset of the target vehicle between the first key frame image and the reference frame image; determine an effective mapping region in the reference frame image based on the rotation angle and offset; and perform texture mapping on the vehicle-bottom blind area in the first key frame image based on the effective mapping region, to obtain the second key frame image.
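
For illustration, one common way to turn the two wheel-speed pulse count values into a rotation angle and an offset is differential odometry over the left and right wheels. The patent does not state its formula, so the constants and the midpoint motion model below are assumptions of this sketch.

```python
import math

# Assumed vehicle constants; the patent gives no values.
PULSES_PER_REV = 96          # wheel-speed sensor pulses per wheel revolution
WHEEL_CIRCUMFERENCE = 2.0    # metres
TRACK_WIDTH = 1.6            # metres between the left and right wheels

def pose_delta(left_pulses, right_pulses):
    """Rotation angle (rad) and offset (m) of the vehicle between the
    reference-frame time and the current time, from the per-wheel
    differences of the two wheel-speed pulse count values."""
    s_left = left_pulses / PULSES_PER_REV * WHEEL_CIRCUMFERENCE
    s_right = right_pulses / PULSES_PER_REV * WHEEL_CIRCUMFERENCE
    rotation = (s_right - s_left) / TRACK_WIDTH   # yaw change
    offset = (s_right + s_left) / 2.0             # distance travelled
    return rotation, offset

# Example: pulse-count differences of 120 (left) and 132 (right)
theta, dist = pose_delta(120, 132)
dx = dist * math.cos(theta / 2.0)                 # midpoint motion model
dy = dist * math.sin(theta / 2.0)
```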

In an embodiment of the present application, the mapping module 540 is further configured to: determine, based on the driving parameters of the target vehicle, the transformation matrices between each of the multiple historical frame images and the first key frame image; based on a target sampling error and taking the feature information in the first key frame image as the standard, randomly sample the feature information of each historical frame image and determine a sampling value for each, the sampling value characterizing the degree of match between the feature information in the historical frame image and the feature information in the first key frame image; and determine the reference frame image from the multiple historical frame images based on the transformation matrices and the respective sampling values.
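
The sampling value can be pictured as an inlier ratio under the target sampling error. The following sketch is one plausible reading, not the patent's definition: it assumes index-aligned feature correspondences and a 3x3 homography per history frame derived from the driving parameters, and the helper names sampling_value and pick_reference are hypothetical.

```python
import numpy as np

def sampling_value(key_feats, hist_feats, transform, max_err,
                   n_samples=100, rng=None):
    """Fraction of randomly sampled history-frame features that, once
    warped by the frame's transformation matrix, land within max_err
    pixels of the corresponding key-frame feature (index-aligned)."""
    rng = rng or np.random.default_rng(0)
    n = min(n_samples, len(hist_feats))
    idx = rng.choice(len(hist_feats), size=n, replace=False)
    pts = np.hstack([hist_feats[idx], np.ones((n, 1))])   # homogeneous
    warped = (transform @ pts.T).T
    warped = warped[:, :2] / warped[:, 2:3]
    errs = np.linalg.norm(warped - key_feats[idx], axis=1)
    return float(np.mean(errs < max_err))                 # matching degree

def pick_reference(key_feats, hist_frames, transforms, max_err):
    """Reference frame = the history frame with the highest sampling value."""
    scores = [sampling_value(key_feats, f, T, max_err)
              for f, T in zip(hist_frames, transforms)]
    return int(np.argmax(scores))
```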

In an embodiment of the present application, the mapping module 540 is further configured to: determine the contour line of the vehicle-bottom blind area; expand outward along the contour line, in the direction away from the blind area, by a target distance to obtain an outer edge line, and take the region bounded by the contour line and the outer edge line as a transition region; perform texture mapping on the vehicle-bottom blind area in the first key frame image based on the effective mapping region, to obtain an image to be calibrated; determine the gray-level gradient values of the effective mapping region in the image to be calibrated; and calibrate the gray levels of the transition region in the image to be calibrated based on the gray-level gradient values, to obtain the second key frame image.
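
A minimal stand-in for the transition-region calibration is to dilate the blind-area mask by the target distance and re-shade only the resulting ring. The patent's gradient-based formula is not published, so the feathering approach below is an assumption of this sketch rather than the claimed method.

```python
import cv2
import numpy as np

def calibrate_transition(image, blind_mask, band_px=10):
    """Re-shade the ring between the blind-area contour and an outer line
    expanded band_px pixels away, hiding the seam of the pasted texture.

    image: BGR bird's-eye image after the blind area has been textured.
    blind_mask: uint8 mask (255 inside the under-vehicle blind area).
    """
    kernel = np.ones((2 * band_px + 1, 2 * band_px + 1), np.uint8)
    outer = cv2.dilate(blind_mask, kernel)        # contour expanded outward
    transition = cv2.subtract(outer, blind_mask)  # region between the lines
    blurred = cv2.GaussianBlur(image, (2 * band_px + 1, 2 * band_px + 1), 0)
    out = image.copy()
    out[transition > 0] = blurred[transition > 0] # smooth only the ring
    return out
```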

In an embodiment of the present application, the conversion module 550 is further configured to: perform semantic feature extraction on the second key frame image to obtain the semantic features of the target objects contained in it, the semantic features including position features, direction features and category features; determine the pose of the target vehicle at the current time; and convert the second key frame image into the local semantic map based on the semantic features of the target objects contained in the second key frame image, the pose of the target vehicle at the current time and the initial pose.
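
The conversion itself amounts to a rigid transform of each detection's position and direction features from the bird's-eye-view pixel frame into the map frame fixed by the initial pose. The resolution, axis convention and field names below are assumptions of this sketch, not values from the patent.

```python
import math

METRES_PER_PIXEL = 0.02      # assumed bird's-eye-view resolution
CENTER_UV = (400.0, 600.0)   # assumed pixel position of the rear axle

def to_local_map(detections, pose):
    """detections: dicts with pixel 'u', 'v', plus 'heading' and 'category'.
    pose: (x, y, yaw) of the vehicle relative to the initial pose.
    Returns landmarks carrying position, direction and category features."""
    x0, y0, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    landmarks = []
    for det in detections:
        # Vehicle frame: x forward, y to the left (assumed convention).
        vx = (CENTER_UV[1] - det["v"]) * METRES_PER_PIXEL
        vy = (CENTER_UV[0] - det["u"]) * METRES_PER_PIXEL
        landmarks.append({
            "x": x0 + c * vx - s * vy,         # position feature (map frame)
            "y": y0 + s * vx + c * vy,
            "heading": yaw + det["heading"],   # direction feature
            "category": det["category"],       # category feature
        })
    return landmarks
```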

In an embodiment of the present application, the second stitching module 560 is further configured to: determine at least two local semantic maps to be matched among the local semantic maps at the multiple times; determine a target point cloud and a source point cloud in the at least two maps to be matched, respectively; determine, based on the geometric features of the target point cloud, the nearest neighbor in the source point cloud of each point in the target point cloud; and stitch the local semantic maps of the target vehicle at the multiple times in the target environment based on those nearest neighbors, generating the global map of the target environment.

In an embodiment of the present application, a positioning module is further included. The positioning module is configured to match features in the global map based on the initial pose of the target vehicle and thereby determine the current pose of the target vehicle.

An electronic device according to an embodiment of the present application is described below with reference to Figure 6. Figure 6 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present application.

As shown in Figure 6, the electronic device 60 includes one or more processors 601 and a memory 602.

The processor 601 may be a central processing unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components of the electronic device 60 to perform desired functions.

The memory 602 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 601 may run these program instructions to implement the methods of the various embodiments of the present application described above and/or other desired functions. Various contents, such as the multiple surround-view images, the first key frame image, the second key frame image, the multiple historical frame images, the local semantic maps and the global map, may also be stored in the computer-readable storage medium.

In one example, the electronic device 60 may further include an input device 603 and an output device 604, these components being interconnected by a bus system and/or other forms of connection mechanisms (not shown).

The input device 603 may include, for example, a keyboard, a mouse, and the like.

The output device 604 may output various information to the outside, including the multiple surround-view images, the first key frame image, the second key frame image, the multiple historical frame images, the local semantic maps, the global map, and the like. The output device 604 may include, for example, a display, a speaker, a printer, a communication network and the remote output devices connected to it, and the like.

Of course, for simplicity, Figure 6 shows only some of the components of the electronic device 60 that are relevant to the present application; components such as buses and input/output interfaces are omitted. In addition, the electronic device 60 may include any other appropriate components depending on the specific application.

In addition to the above methods and devices, an embodiment of the present application may also be a computer program product comprising computer program instructions which, when run by a processor, cause the processor to perform the steps of the methods according to the various embodiments of the present application described above in this specification.

The computer program product may carry program code, written in any combination of one or more programming languages, for performing the operations of the embodiments of the present application; the programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.

In addition, an embodiment of the present application may also be a computer-readable storage medium having stored thereon computer program instructions which, when run by a processor, cause the processor to perform the steps of the methods according to the various embodiments of the present application described above in this specification.

The computer-readable storage medium may be any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection having one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

The basic principles of the present application have been described above in conjunction with specific embodiments. However, it should be pointed out that the advantages, strengths, effects and the like mentioned in this application are merely examples and not limitations; they cannot be considered indispensable to every embodiment of this application. In addition, the specific details disclosed above serve only the purposes of illustration and ease of understanding, not of limitation; the above details do not restrict this application to being implemented with those specific details.

The block diagrams of the components, apparatuses, devices and systems involved in this application are merely illustrative examples and are not intended to require or imply that they must be connected, arranged or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these components, apparatuses, devices and systems may be connected, arranged and configured in any manner. Words such as "comprising", "including" and "having" are open-ended terms that mean "including but not limited to" and may be used interchangeably therewith. As used here, the words "or" and "and" refer to "and/or" and may be used interchangeably therewith, unless the context clearly indicates otherwise. As used here, the word "such as" refers to the phrase "such as, but not limited to" and may be used interchangeably therewith.

It should also be pointed out that in the apparatuses, devices and methods of the present application, each component or step may be decomposed and/or recombined. Such decompositions and/or recombinations shall be regarded as equivalent solutions of this application.

The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the application. Therefore, this application is not intended to be limited to the aspects shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The above description has been given for the purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the present application to the forms disclosed herein. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions and sub-combinations thereof.

Claims (10)

1. A visual-semantics-based mapping method, comprising:
acquiring a plurality of surround-view images of a target vehicle collected in a target environment at a current time;
performing bird's-eye-view stitching on the plurality of surround-view images to generate a first key frame image, wherein the first key frame image comprises a vehicle-bottom blind area;
determining a plurality of historical frame images without the vehicle-bottom blind area corresponding to the first key frame image;
performing texture mapping on the vehicle-bottom blind area in the first key frame image based on the plurality of historical frame images, to obtain a second key frame image without the vehicle-bottom blind area;
converting the second key frame image into a local semantic map based on an initial pose of the target vehicle; and
stitching local semantic maps of the target vehicle at a plurality of times in the target environment to generate a global map of the target environment.
2. The method of claim 1, wherein the performing texture mapping on the vehicle-bottom blind area in the first key frame image based on the plurality of historical frame images to obtain a second key frame image without the vehicle-bottom blind area comprises:
determining, among the plurality of historical frame images, a reference frame image for mapping the vehicle-bottom blind area in the first key frame image;
determining a wheel-speed pulse count value of the target vehicle at the current time and a wheel-speed pulse count value of the target vehicle at a historical time corresponding to the reference frame image;
determining a rotation angle and an offset of the target vehicle between the first key frame image and the reference frame image based on the wheel-speed pulse count value of the target vehicle at the current time and the wheel-speed pulse count value at the historical time;
determining an effective mapping region in the reference frame image based on the rotation angle and the offset; and
performing texture mapping on the vehicle-bottom blind area in the first key frame image based on the effective mapping region, to obtain the second key frame image.
3. The method of claim 2, wherein the determining, among the plurality of historical frame images, a reference frame image for mapping the vehicle-bottom blind area in the first key frame image comprises:
determining transformation matrices between the plurality of historical frame images and the first key frame image respectively, based on driving parameters of the target vehicle;
randomly sampling, based on a target sampling error and with feature information in the first key frame image as the standard, the feature information corresponding to each of the plurality of historical frame images, and determining sampling values corresponding to each of the plurality of historical frame images, wherein the sampling values characterize a degree of match between the feature information in the historical frame images and the feature information in the first key frame image; and
determining the reference frame image from the plurality of historical frame images based on the transformation matrices between the plurality of historical frame images and the first key frame image and the sampling values corresponding to the plurality of historical frame images respectively.
4. The method of claim 2, wherein the performing texture mapping on the vehicle-bottom blind area in the first key frame image based on the effective mapping region to obtain the second key frame image comprises:
determining a contour line of the vehicle-bottom blind area;
expanding outward along the contour line by a target distance in a direction away from the vehicle-bottom blind area to obtain an outer edge line, and taking a region formed by the contour line and the outer edge line as a transition region;
performing texture mapping on the vehicle-bottom blind area in the first key frame image based on the effective mapping region, to obtain an image to be calibrated;
determining gray-level gradient values of the effective mapping region in the image to be calibrated; and
calibrating gray levels of the transition region in the image to be calibrated based on the gray-level gradient values, to obtain the second key frame image.
5. The method of any one of claims 1 to 4, wherein the converting the second key frame image into a local semantic map based on the initial pose of the target vehicle comprises:
performing semantic feature extraction on the second key frame image to obtain semantic features of a target object contained in the second key frame image, wherein the semantic features comprise position features, direction features and category features;
determining a pose of the target vehicle at the current time; and
converting the second key frame image into the local semantic map based on the semantic features of the target object contained in the second key frame image, the pose of the target vehicle at the current time and the initial pose.
6. The method of any one of claims 1 to 4, wherein the stitching local semantic maps of the target vehicle at a plurality of times in the target environment to generate a global map of the target environment comprises:
determining at least two local semantic maps to be matched among the local semantic maps at the plurality of times;
determining a target point cloud and a source point cloud in the at least two local semantic maps to be matched, respectively;
determining, based on geometric features of the target point cloud, a nearest neighbor point in the source point cloud for each point in the target point cloud; and
stitching the local semantic maps of the target vehicle at the plurality of times in the target environment based on the nearest neighbor point in the source point cloud of each point in the target point cloud, to generate the global map of the target environment.
7. The method of any one of claims 1 to 4, further comprising:
matching features in the global map based on the initial pose of the target vehicle, and determining a current pose of the target vehicle.
8. A visual-semantics-based mapping device, comprising:
an acquisition module, configured to acquire a plurality of surround-view images of a target vehicle collected in a target environment at a current time;
a first stitching module, configured to perform bird's-eye-view stitching on the plurality of surround-view images to generate a first key frame image, wherein the first key frame image comprises a vehicle-bottom blind area;
a determination module, configured to determine a plurality of historical frame images without the vehicle-bottom blind area corresponding to the first key frame image;
a mapping module, configured to perform texture mapping on the vehicle-bottom blind area in the first key frame image based on the plurality of historical frame images, to obtain a second key frame image without the vehicle-bottom blind area;
a conversion module, configured to convert the second key frame image into a local semantic map based on an initial pose of the target vehicle; and
a second stitching module, configured to stitch local semantic maps of the target vehicle at a plurality of times in the target environment to generate a global map of the target environment.
9. A computer-readable storage medium, wherein the storage medium stores a computer program for executing the method of any one of claims 1 to 7.
10. An electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the method of any one of claims 1 to 7.