
CN117671644A - Signboard detection method and device and vehicle - Google Patents


Info

Publication number
CN117671644A
CN117671644A
Authority
CN
China
Prior art keywords
dimensional
visual detection
frame
bounding box
signboard
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311760806.6A
Other languages
Chinese (zh)
Inventor
刘诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xiaopeng Autopilot Technology Co Ltd
Original Assignee
Guangzhou Xiaopeng Autopilot Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Xiaopeng Autopilot Technology Co Ltd filed Critical Guangzhou Xiaopeng Autopilot Technology Co Ltd
Priority to CN202311760806.6A priority Critical patent/CN117671644A/en
Publication of CN117671644A publication Critical patent/CN117671644A/en
Priority to PCT/CN2024/136804 priority patent/WO2025130617A1/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/582 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of traffic signs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application discloses a signboard detection method, a signboard detection device, and a vehicle. The method comprises the following steps: acquiring target point cloud data corresponding to a target area through a laser radar of a vehicle, and acquiring camera visual data corresponding to the target area through a camera of the vehicle, wherein the target area comprises at least one signboard; performing point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data; acquiring at least one two-dimensional visual detection frame corresponding to the camera visual data; and obtaining the position of the at least one signboard included in the target area according to the at least one three-dimensional bounding box and the at least one two-dimensional visual detection frame. Because the position of the signboard is derived jointly from the three-dimensional point cloud data acquired by the laser radar and the two-dimensional camera visual data acquired by the camera, the accuracy of determining the position of the signboard is improved.

Description

Signboard detection method and device and vehicle
Technical Field
The application relates to the technical field of automobiles, and in particular relates to a signboard detection method and device and a vehicle.
Background
With the development of science and technology and the improvement of living standards, vehicles have become increasingly common and increasingly feature-rich. In the related art, traffic sign detection is an important technical link in automatic driving, and the accuracy required of sign detection keeps rising.
Disclosure of Invention
The application provides a signboard detection method and device and a vehicle so as to address these problems.
In a first aspect, an embodiment of the present application provides a method for detecting a signboard, which is applied to a vehicle, and the method includes: acquiring target point cloud data corresponding to a target area through a laser radar of the vehicle, and acquiring camera vision data corresponding to the target area through a camera of the vehicle, wherein the target area comprises at least one signboard; performing point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data; acquiring at least one two-dimensional visual detection frame corresponding to the camera visual data; and obtaining the position of the at least one signboard included in the target area according to the at least one three-dimensional surrounding frame and the at least one two-dimensional visual detection frame.
In a second aspect, an embodiment of the present application further provides a signboard detection device, applied to a vehicle, including: the system comprises a target point cloud data acquisition module, a three-dimensional bounding box acquisition module, a two-dimensional visual detection box acquisition module and a signboard position acquisition module. The target point cloud data acquisition module is used for acquiring target point cloud data corresponding to a target area through a laser radar of the vehicle and acquiring camera vision data corresponding to the target area through a camera of the vehicle, wherein the target area comprises at least one signboard; the three-dimensional bounding box obtaining module is used for carrying out point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data; the two-dimensional visual detection frame acquisition module is used for acquiring at least one two-dimensional visual detection frame corresponding to the camera visual data; the signboard position obtaining module is used for obtaining the position of the at least one signboard included in the target area according to the at least one three-dimensional surrounding frame and the at least one two-dimensional visual detection frame.
In a third aspect, embodiments of the present application further provide a vehicle, including: one or more processors, a memory, and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors so as to implement the method described in the first aspect above.
In a fourth aspect, embodiments of the present application also provide a computer readable storage medium having program code stored therein, the program code being executable by a processor to perform the method according to the first aspect.
According to the technical scheme, target point cloud data corresponding to a target area are obtained through a laser radar of a vehicle, and camera visual data corresponding to the target area are obtained through a camera of the vehicle, wherein the target area comprises at least one signboard; point cloud segmentation is performed on the target point cloud data to obtain at least one three-dimensional bounding box; at least one two-dimensional visual detection frame corresponding to the camera visual data is acquired; and the position of the at least one signboard included in the target area is obtained according to the three-dimensional bounding boxes and the two-dimensional visual detection frames. The signboard is thus detected in a cross-sensor manner that combines laser radar and vision, which improves the accuracy of determining the position of the signboard.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and a person skilled in the art may derive other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a signboard detection method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a signboard detection method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of acquiring at least one three-dimensional bounding box according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of a signboard detection method according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of a signboard detection method according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of a signboard detection method according to an embodiment of the present application;
FIG. 7 is a schematic flow chart of a signboard detection method according to an embodiment of the present application;
FIG. 8 shows a block diagram of a signboard detection apparatus according to an embodiment of the present application;
FIG. 9 shows a block diagram of a vehicle for performing a signboard detection method according to an embodiment of the present application;
FIG. 10 shows a storage unit for storing or carrying program code implementing the signboard detection method according to an embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings.
With the development of science and technology, automatic driving of vehicles has advanced considerably. Signboard detection is an important link in automatic driving technology, so the requirements placed on signboard detection keep rising.
Because signboards are small, determining their position is difficult. In the related art, two schemes have been proposed: projecting a visual two-dimensional detection frame into the three-dimensional scene to determine the signboard position, and a deep learning scheme based on the laser radar. The projection scheme places extremely high demands on whether the detected scene is horizontal and on the quality of the two-dimensional detection; moreover, the semantic part of a traffic sign is above the ground, while two-dimensional distance estimation presupposes that the detected object is on the ground, so the true signboard position cannot be obtained by visual projection alone. The pure deep learning scheme based on the laser radar requires a powerful graphics processor, so it places high demands on the deployment hardware and the amount of annotated data and carries a high development cost; in addition, a segmentation scheme based on the laser radar cannot provide the category of a signboard, and under-segmentation easily causes missed detections.
Therefore, position detection of signboards in the related art suffers from limited accuracy.
In view of these problems, the inventor, through long-term research, proposes the signboard detection method and device and the vehicle provided by the embodiments of the present application. The position of a signboard included in a target area is obtained from both the three-dimensional point cloud data of the target area acquired by the laser radar and the two-dimensional camera visual data of the target area acquired by the camera, so that the signboard is detected in a cross-sensor manner that combines laser radar and vision, and the accuracy of determining the position of the signboard is improved.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, FIG. 1 is a schematic flow chart of a signboard detection method according to an embodiment of the present application. In a specific embodiment, the signboard detection method can be applied to the signboard detection apparatus 200 shown in FIG. 8 and to the vehicle 100 (FIG. 9) equipped with the signboard detection apparatus 200. The specific flow of this embodiment is described below taking a vehicle as an example; it will be understood that the applicable vehicle may be any vehicle with processing capability, such as an electric vehicle or a gasoline vehicle, which is not limited herein. The signboard detection method shown in FIG. 1 specifically includes the following steps:
Step S110: and acquiring target point cloud data corresponding to a target area through a laser radar of the vehicle, and acquiring camera vision data corresponding to the target area through a camera of the vehicle, wherein the target area comprises at least one signboard.
In some embodiments, a lidar may be included in the vehicle, e.g., mechanically scanned, semi-solid, etc., type lidar. The vehicle can acquire the point cloud data of the environment where the vehicle is located in real time through the laser radar.
In some embodiments, one or more cameras may be included in the vehicle. The vehicle can acquire camera visual data of the environment where the vehicle is located in real time through the camera. Wherein the camera visual data may comprise one or more images.
The target area comprises at least one signboard. The target area may be understood as the driving environment of the vehicle, or equivalently as the irradiation range of the vehicle's laser radar. The signboard may be a prohibition signboard, a warning signboard, an indication signboard, etc., which is not limited herein.
In some embodiments, the vehicle may receive a user-entered control instruction, which may be used to instruct the vehicle to automatically drive. Accordingly, the vehicle can respond to the control instruction, acquire target point cloud data corresponding to the target area through the laser radar of the vehicle, and acquire camera vision data corresponding to the target area through the camera of the vehicle.
In some embodiments, in order to improve the accuracy of the poses acquired for objects in the target area, the vehicle may perform motion compensation on the point cloud data of the target area acquired by the laser radar to obtain the target point cloud data, thereby improving the usability of the target point cloud data. Optionally, different types of laser radar call for different motion-compensation procedures.
When the laser radar of the vehicle is a solid-state laser radar, the vehicle may acquire first point cloud data of the target area collected in the current frame and second point cloud data of the target area collected in the frame preceding the current frame, and may perform motion compensation on the first pose of an object in the first point cloud data based on the second pose of the corresponding object in the second point cloud data, so as to obtain the target point cloud data and improve its usability.
When the laser radar of the vehicle is a rotating (mechanical) laser radar, the vehicle may acquire multiple point cloud frames of the target area collected over a preset number of frames and match them against one another to obtain the target point cloud data, thereby improving its usability.
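The per-frame motion compensation described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation; the function name and the use of a relative ego pose (rotation plus translation) between frames are assumptions:

```python
import numpy as np

def compensate_motion(points, rel_rotation, rel_translation):
    """Transform lidar points captured in a previous frame into the
    current vehicle frame, given the relative ego pose between frames.

    points          : (N, 3) array of points in the previous frame
    rel_rotation    : (3, 3) rotation from the previous frame to the current one
    rel_translation : (3,) translation of the previous origin in the current frame
    """
    points = np.asarray(points, dtype=float)
    rotation = np.asarray(rel_rotation, dtype=float)
    translation = np.asarray(rel_translation, dtype=float)
    # Apply the rigid transform p' = R p + t to every point at once.
    return points @ rotation.T + translation
```

With an identity rotation, the compensation reduces to a pure shift by the distance the vehicle travelled between the two frames.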
Step S120: and performing point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data.
In some embodiments, after the vehicle obtains the target point cloud data corresponding to the target area, point cloud segmentation may be performed on the target point cloud data to obtain at least one three-dimensional bounding box. A segmentation algorithm may be preset in the vehicle, for example an algorithm based on plane fitting, an algorithm based on the characteristics of laser point cloud data, a recursive greedy algorithm, the Hungarian algorithm, and the like, which is not limited herein.
For example, the vehicle may segment the target point cloud data using the Hungarian algorithm; if under-segmentation of the target point cloud data is a concern, the vehicle may instead segment it using a recursive greedy algorithm.
The point cloud segmentation may yield at least one irregularly shaped three-dimensional bounding box corresponding to the target point cloud data.
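As a rough sketch of segmentation by clustering (the patent names plane-fitting, recursive greedy, and Hungarian variants; the naive Euclidean clustering below is only an assumed stand-in), points are grouped by distance and each cluster is wrapped in an axis-aligned box:

```python
import numpy as np

def segment_point_cloud(points, distance_threshold=0.5, min_points=1):
    """Greedy Euclidean clustering: points closer than distance_threshold
    end up in the same cluster; each cluster yields an axis-aligned
    (min corner, max corner) three-dimensional bounding box."""
    points = np.asarray(points, dtype=float)
    remaining = list(range(len(points)))
    boxes = []
    while remaining:
        frontier = [remaining.pop(0)]   # seed a new cluster
        cluster = []
        while frontier:
            idx = frontier.pop()
            cluster.append(idx)
            # Pull every still-unassigned point within the threshold.
            near = [j for j in remaining
                    if np.linalg.norm(points[j] - points[idx]) < distance_threshold]
            for j in near:
                remaining.remove(j)
            frontier.extend(near)
        if len(cluster) >= min_points:
            pts = points[cluster]
            boxes.append((pts.min(axis=0), pts.max(axis=0)))
    return boxes
```

This O(N²) sketch is for clarity only; production systems accelerate the neighbor search with a KD-tree or voxel grid.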
Step S130: and acquiring at least one two-dimensional visual detection frame corresponding to the camera visual data.
In some embodiments, after the vehicle obtains the camera visual data corresponding to the target area, at least one two-dimensional visual detection frame corresponding to that data may be acquired. The camera visual data may comprise at least one image, and the at least one image may contain at least one object; the vehicle may perform motion compensation on the objects in the images to obtain a target image, and may then perform target detection on the objects in the target image to obtain the at least one two-dimensional visual detection frame.
The vehicle may process the camera visual data with a target detection algorithm, for example a 2D perception algorithm or a YOLO algorithm, which is not limited herein. The detected objects may include signboards, static obstacles, dynamic obstacles, and the like. Dynamic obstacles include, but are not limited to, motor vehicles and pedestrians; static obstacles include, but are not limited to, lane lines, parking spaces, drivable areas, and general obstacles.
Optionally, the vehicle may perform signboard detection on the camera visual data to obtain at least one corresponding two-dimensional visual detection frame, each containing one signboard. The vehicle may also take the geographic position of the target area into account: signboard shapes may be the same or differ between countries, so the vehicle can determine the country from the geographic position, infer the expected signboard shapes for that country, and use them during detection, improving the accuracy of visual signboard detection.
The vehicle may likewise take the weather conditions of the target area into account: the reflective color of a signboard may be the same or differ under different weather, so the vehicle can determine the expected reflective colors from the current weather and use them during detection, again improving the accuracy of visual signboard detection.
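Whichever detector produces the two-dimensional frames, the downstream matching step only needs the frames whose object is a signboard. A minimal filtering helper (the detection dictionary format with 'box', 'label', and 'score' keys is an assumption for illustration, not an API from the patent):

```python
def select_sign_detections(detections, score_threshold=0.5):
    """Keep only two-dimensional detections classified as signboards
    with sufficient confidence; other classes (vehicles, pedestrians,
    lane lines, ...) are handled by separate obstacle pipelines."""
    return [d for d in detections
            if d["label"] == "signboard" and d["score"] >= score_threshold]
```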
Step S140: and obtaining the position of the at least one signboard included in the target area according to the at least one three-dimensional surrounding frame and the at least one two-dimensional visual detection frame.
In some embodiments, after the vehicle obtains at least one three-dimensional bounding box and at least one two-dimensional visual detection frame corresponding to the target area, the position of the at least one signboard included in the target area may be obtained from them.
The vehicle may project the at least one three-dimensional bounding box into the two-dimensional image space and associate it with the at least one two-dimensional visual detection frame. Optionally, a three-dimensional bounding box whose projection overlaps a two-dimensional visual detection frame may be taken as a frame to be verified; alternatively, a three-dimensional bounding box whose projection lies within a preset distance of a two-dimensional visual detection frame may be taken as the frame to be verified.
Correspondingly, the vehicle may check the attribute of the object in the two-dimensional visual detection frame associated with the frame to be verified; if the object is detected to be a signboard, the position of the frame to be verified, given by the position of the corresponding three-dimensional bounding box, may be determined to be the position of the signboard.
In some embodiments, the number of three-dimensional bounding boxes corresponding to the box to be verified may be one or more. Optionally, when the number of three-dimensional bounding boxes corresponding to the to-be-verified box is one, the position of the to-be-verified box may be the position of the three-dimensional bounding box; when the number of three-dimensional bounding boxes corresponding to the frame to be verified is a plurality of, the position of the frame to be verified may be the position of any one of the three-dimensional bounding boxes, for example, the position of the three-dimensional bounding box with the largest projected area in the two-dimensional space, or the position of the three-dimensional bounding box with the smallest projected area in the two-dimensional space. When the number of the three-dimensional bounding boxes corresponding to the to-be-verified frame is multiple, the vehicle can fuse the multiple three-dimensional bounding boxes to obtain a fused three-dimensional bounding box, and accordingly, the vehicle can determine the position of the fused three-dimensional bounding box as the position of the to-be-verified frame.
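The projection-and-association step can be sketched as follows, assuming a pinhole camera with the box already expressed in the camera frame (z pointing forward); the function names and the axis-aligned box representation are illustrative assumptions:

```python
import numpy as np

def project_box_to_image(box_min, box_max, intrinsics):
    """Project the 8 corners of an axis-aligned 3D box (camera frame,
    z forward) through a pinhole model and return the enclosing
    2D box as (min_uv, max_uv)."""
    corners = np.array([[x, y, z]
                        for x in (box_min[0], box_max[0])
                        for y in (box_min[1], box_max[1])
                        for z in (box_min[2], box_max[2])], dtype=float)
    uvw = corners @ np.asarray(intrinsics, dtype=float).T
    uv = uvw[:, :2] / uvw[:, 2:3]          # perspective divide
    return uv.min(axis=0), uv.max(axis=0)

def boxes_overlap(a_min, a_max, b_min, b_max):
    """True when two axis-aligned 2D boxes share any area, i.e. the
    projected 3D box becomes a candidate frame to be verified."""
    return bool(np.all(np.asarray(a_min) <= np.asarray(b_max))
                and np.all(np.asarray(b_min) <= np.asarray(a_max)))
```

A signboard box is typically nearly flat in depth, which the axis-aligned corner enumeration handles naturally.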
In some embodiments, the vehicle may also obtain the type of the signboard, such as prohibiting passage, prohibiting parking, etc., in the case where the attribute of the object is determined to be the signboard. Accordingly, the vehicle can obtain the type of the signboard under the condition of obtaining the position of the signboard, and output the position and the type of the signboard, so that the vehicle can plan a driving route based on the position and the type of the signboard, and the rationality and the safety of automatic driving or auxiliary driving of the vehicle are improved, and the experience of a user is improved.
In some embodiments, referring to fig. 2, the method for detecting a signboard according to an embodiment of the present application may further include steps S141 to S143 before step S140.
Step S141: and detecting the obstacle of the target point cloud data to obtain an obstacle frame corresponding to the target point cloud data.
In some embodiments, to reduce the computation involved in signboard detection, after the vehicle obtains the target point cloud data, obstacle detection may be performed on it to obtain an obstacle frame corresponding to the target point cloud data. The vehicle then filters the target point cloud data based on the obstacle frame, which reduces the amount of point cloud data to process, the computation needed to locate signboards in the target area, and the power consumed by signboard detection.
The vehicle may detect obstacles in the target point cloud data by deep learning, for example using a convolutional neural network or a recurrent neural network, to obtain the obstacle frame corresponding to the target point cloud data.
The vehicle may first preprocess the target point cloud data, for example by filtering and removing outliers at the distribution edge, and then perform obstacle detection on the preprocessed data with a deep learning model to obtain the obstacle frame. The vehicle may also segment the ground point cloud out of the target point cloud data and cluster the remaining points with a clustering algorithm to obtain several clusters, each representing one obstacle; a bounding box is then fitted to each cluster to obtain the obstacle frame corresponding to the target point cloud data.
Step S142: and obtaining an obstacle fusion result corresponding to the target point cloud data according to the obstacle frame, the at least one three-dimensional bounding box and the camera vision data.
In some embodiments, after the vehicle obtains the obstacle frame corresponding to the target point cloud data, the at least one three-dimensional bounding box, and the camera visual data corresponding to the target area, an obstacle fusion result corresponding to the target point cloud data may be obtained from them.
The vehicle may project the at least one three-dimensional bounding box obtained by segmentation and the obstacle frame obtained by deep learning into two-dimensional space to obtain two-dimensional frames to be fused, and may then fuse these frames with the camera visual data to obtain the obstacle fusion result corresponding to the target point cloud data.
It can be appreciated that, in this embodiment, the fusion is performed by projecting the 3D point cloud data into two-dimensional space and combining it with the two-dimensional camera visual data; in addition, the vehicle may obtain the depth of the objects in the point cloud from the three-dimensional data. On this basis, after obtaining the obstacle fusion result, the vehicle filters the point cloud data belonging to obstacles in the target area out of the target point cloud data, which reduces the computation and data volume involved in detection, saves storage space, and improves the accuracy of signboard detection.
Step S143: and filtering the at least one three-dimensional bounding box according to the obstacle fusion result to obtain the filtered at least one three-dimensional bounding box.
In some embodiments, after the vehicle obtains the obstacle fusion result corresponding to the target point cloud data, at least one three-dimensional bounding box corresponding to the target area may be filtered according to the obstacle fusion result, and the filtered at least one three-dimensional bounding box is obtained.
In some implementations, the vehicle may acquire ground point cloud data included in the target point cloud data; the ground point cloud data may be understood as the point cloud data returned when the laser radar irradiates the ground. Accordingly, after obtaining the ground point cloud data included in the target point cloud data, the vehicle may combine the ground point cloud data to filter, from the at least one three-dimensional bounding box, the bounding boxes belonging to the obstacle fusion result in the target point cloud data, obtaining the filtered at least one three-dimensional bounding box, so as to improve the speed and accuracy of vehicle signboard detection.
In some embodiments, the vehicle may acquire the ground point cloud data included in the target point cloud data in the process of acquiring the obstacle fusion result corresponding to the target point cloud data according to the obstacle frame, the at least one three-dimensional bounding box, and the camera vision data. Accordingly, after obtaining the ground point cloud data, the vehicle may combine, according to the obstacle fusion result and the ground point cloud data, the point cloud data in the target point cloud data that does not belong to the obstacle fusion result with the ground point cloud data, thereby obtaining primary screening point cloud data.
Accordingly, the vehicle may project the primary screening point cloud data (including the ground point cloud data and the point cloud data not belonging to the obstacle fusion result) included in the target point cloud data to a two-dimensional space to obtain a two-dimensional frame to be verified, and may match the frame to be verified with at least one visual detection frame corresponding to the camera visual data to obtain the position and the kind of at least one signboard included in the target area.
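The primary-screening step above — keeping ground points together with points outside the obstacle fusion result — can be sketched as follows, assuming per-point boolean masks; the mask representation and the function name are assumptions for illustration only.

```python
def primary_screen(points, is_ground, in_obstacle):
    """Keep ground points plus all points not belonging to the obstacle
    fusion result; the kept set forms the primary screening point cloud data.

    points: list of (x, y, z) tuples.
    is_ground / in_obstacle: per-point boolean masks (assumed inputs).
    """
    return [p for p, g, o in zip(points, is_ground, in_obstacle) if g or not o]
```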
For example, please refer to fig. 3, which is a schematic diagram illustrating a process of acquiring at least one three-dimensional bounding box according to an embodiment of the present application. The vehicle can acquire the point cloud data of the target area collected in real time by the laser radar, and can obtain the target point cloud data by using the pose of the objects in the target area in the previous frame of point cloud data as motion compensation for the pose of those objects in the current frame of point cloud data, so as to improve the usability of the point cloud data.
After the vehicle obtains the target point cloud data, obstacle detection can be performed on the target point cloud data based on a preset deep learning algorithm, so that an obstacle frame corresponding to the target point cloud data is obtained. After the vehicle obtains the target point cloud data, the vehicle may obtain the point cloud data returned by irradiating the laser radar onto the ground, that is, obtain the ground point cloud data in the target point cloud data.
The vehicle can also divide the target point cloud data through a division algorithm to obtain at least one three-dimensional bounding box corresponding to the target point cloud data.
The vehicle can acquire camera visual data of the target area collected in real time by the camera, and can acquire at least one two-dimensional visual detection frame corresponding to the camera visual data. The vehicle can also project the obstacle frame obtained by deep learning and the at least one three-dimensional bounding box obtained by segmenting the target point cloud data into a two-dimensional space, and match and fuse them with the at least one two-dimensional visual detection frame to obtain the obstacle fusion result corresponding to the target point cloud data. The at least one three-dimensional bounding box can then be filtered according to the obstacle fusion result to obtain the filtered at least one three-dimensional bounding box.
In some embodiments, referring to fig. 4, the method for detecting a signboard according to an embodiment of the present application may further include step S144-step S145 after step S140.
Step S144: and acquiring the target two-dimensional visual detection frames corresponding to the at least one signboard respectively.
In some embodiments, the vehicle may acquire a two-dimensional visual detection frame corresponding to each of the at least one signboard after acquiring the position of the at least one signboard included in the target area, and may determine the two-dimensional visual detection frame as the target two-dimensional visual detection frame.
Step S145: and obtaining the respective corresponding types of the at least one signboard according to the camera vision data corresponding to the target two-dimensional vision detection frame.
In some embodiments, after the vehicle obtains the target two-dimensional visual detection frame corresponding to each of the at least one signboard, the type corresponding to each of the at least one signboard may be obtained according to the camera visual data corresponding to the target two-dimensional visual detection frame. The vehicle can detect the type of the identification plate included in the target two-dimensional visual detection frame based on the camera visual data corresponding to the target two-dimensional visual detection frame through an image recognition algorithm.
In some embodiments, after obtaining the camera visual data, the vehicle may process the camera visual data based on an image recognition algorithm to obtain a camera signboard perception, that is, at least one signboard included in the target area.
Accordingly, after the vehicle determines the respective positions of the at least one signboard included in the target area, the vehicle can obtain, according to the target two-dimensional visual detection frame corresponding to each signboard, the type of the signboard detected in advance within that frame. In this way, reliable position and depth information of the signboard is obtained by combining the visually detected obstacle result with the laser radar segmentation result, that is, by combining the directional laser radar with vision, which reduces the cost of signboard detection and improves the accuracy of signboard detection.
According to the signboard detection method provided in this embodiment, target point cloud data corresponding to a target area is obtained through a laser radar of a vehicle, and camera vision data corresponding to the target area is obtained through a camera of the vehicle, wherein the target area includes at least one signboard; point cloud segmentation is performed on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data; at least one two-dimensional visual detection frame corresponding to the camera vision data is acquired; and the position of the at least one signboard included in the target area is obtained according to the at least one three-dimensional bounding box and the at least one two-dimensional visual detection frame. In this way, the position of the signboard is obtained from the three-dimensional point cloud data acquired by the laser radar and the two-dimensional camera vision data acquired by the camera, that is, the signboard is detected by combining the laser radar with vision, which improves the accuracy of determining the position of the signboard.
Referring to fig. 5, fig. 5 is a flow chart illustrating a method for detecting a signboard according to an embodiment of the present application. The method is applied to the vehicle, and will be described in detail with respect to the flow shown in fig. 5, and the method for detecting the signboard specifically includes the following steps:
Step S210: and acquiring target point cloud data corresponding to a target area through a laser radar of the vehicle, and acquiring camera vision data corresponding to the target area through a camera of the vehicle, wherein the target area comprises at least one signboard.
Step S220: and performing point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data.
Step S230: and acquiring at least one two-dimensional visual detection frame corresponding to the camera visual data.
For a specific description of step S210 to step S230, please refer to the previous description of step S110 to step S130, and the detailed description is omitted herein.
Step S240: and projecting the at least one three-dimensional bounding box into a two-dimensional space to obtain two-dimensional boxes corresponding to the at least one three-dimensional bounding box respectively.
In some embodiments, after the vehicle obtains at least one three-dimensional bounding box corresponding to the target point cloud data, the at least one three-dimensional bounding box may be projected into a two-dimensional space, and two-dimensional boxes corresponding to the at least one three-dimensional bounding box respectively are obtained. The vehicle may project the at least one three-dimensional bounding box into a camera coordinate system based on a camera coordinate system corresponding to a camera of the vehicle, and obtain two-dimensional boxes corresponding to the at least one three-dimensional bounding box.
Optionally, the vehicle may project a point included in the three-dimensional bounding box to a two-dimensional space to obtain a two-dimensional box corresponding to the three-dimensional bounding box; the vehicle can project the points of the boundary frames corresponding to the three-dimensional bounding frames to a two-dimensional space to obtain two-dimensional frames corresponding to the three-dimensional bounding frames; the vehicle can also project the vertex of the bounding box corresponding to the three-dimensional bounding box to a two-dimensional space to obtain a two-dimensional box corresponding to the three-dimensional bounding box.
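Projecting a three-dimensional bounding box by its vertices (the last option above) can be sketched with a pinhole camera model: project the eight corners and take the enclosing rectangle of the resulting pixels. The intrinsics fx, fy, cx, cy and the function name are assumptions, not values from the application.

```python
def project_box_to_2d(vertices, fx, fy, cx, cy):
    """vertices: the 8 (x, y, z) corners of a 3D bounding box in the camera
    frame, with z > 0; returns the enclosing 2D box (u_min, v_min, u_max, v_max).
    """
    us = [fx * x / z + cx for x, _, z in vertices]
    vs = [fy * y / z + cy for _, y, z in vertices]
    return min(us), min(vs), max(us), max(vs)
```

In practice the corners would first be transformed from the vehicle body frame into the camera frame using the extrinsic calibration before this projection is applied.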
Step S250: and matching the two-dimensional frames corresponding to the at least one three-dimensional bounding box with the at least one two-dimensional visual detection frame to obtain a matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box.
In some embodiments, after the vehicle obtains the two-dimensional frames corresponding to the at least one three-dimensional bounding box, the two-dimensional frames corresponding to the at least one three-dimensional bounding box are matched with the at least one two-dimensional visual detection frame, and the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box is obtained.
The vehicle can match the at least one two-dimensional visual detection frame in the camera coordinate system with the two-dimensional frames obtained by projecting the at least one three-dimensional bounding box into the camera coordinate system, so that the position of the at least one signboard in the target area is obtained in a cross-sensor manner (combining the three-dimensional objects detected by the laser radar with the two-dimensional frames obtained by visual detection), which improves the accuracy of signboard position detection.
In some embodiments, the process in which the vehicle matches the two-dimensional frames corresponding to the at least one three-dimensional bounding box with the at least one two-dimensional visual detection frame to obtain the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box may include: acquiring the overlapping region between each two-dimensional visual detection frame and the two-dimensional frames corresponding to the at least one three-dimensional bounding box, and determining the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box based on the overlapping region.
The vehicle can acquire the area of the overlapping region between a two-dimensional visual detection frame and the two-dimensional frame corresponding to a three-dimensional bounding box that overlaps it. If the area is larger than an overlapping-area threshold, a matching relationship between that two-dimensional visual detection frame and that three-dimensional bounding box can be established. Accordingly, the vehicle may acquire the areas of the overlapping regions between a two-dimensional visual detection frame and the two-dimensional frames corresponding to the at least one three-dimensional bounding box, and determine the matching relationship between the two-dimensional visual detection frame and the at least one three-dimensional bounding box according to the number of overlapping regions whose areas are greater than the overlapping-area threshold.
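The overlap-area test above can be sketched as follows for axis-aligned two-dimensional frames; the strictly-greater threshold semantics and the function names are assumptions.

```python
def overlap_area(a, b):
    """a, b: axis-aligned boxes (x_min, y_min, x_max, y_max) in pixels."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)  # zero when the boxes do not overlap

def matches(vis_box, proj_box, area_threshold):
    """A visual detection frame and a projected 2D box match when their
    overlapping region exceeds the area threshold."""
    return overlap_area(vis_box, proj_box) > area_threshold
```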
In some embodiments, considering that the area of a signboard is relatively small, the vehicle may acquire the area of the two-dimensional frame corresponding to each of the at least one three-dimensional bounding box, and match the two-dimensional frames with the at least one two-dimensional visual detection frame in a recursive matching manner, in order of two-dimensional frame area from smallest to largest, so as to obtain the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box.
The vehicle may also obtain the radius of the two-dimensional frame corresponding to each of the at least one three-dimensional bounding box, and match the two-dimensional frames with the at least one two-dimensional visual detection frame in a recursive matching manner, in order of two-dimensional frame radius from smallest to largest, so as to obtain the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box.
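The recursive matching is not specified in detail in the application; a minimal greedy sketch that processes projected boxes from smallest to largest (the ordering described above) might look like the following, where the function name and the area threshold are assumptions.

```python
def match_smallest_first(proj_boxes, vis_boxes, area_threshold):
    """Match projected 2D boxes to visual detection frames, smallest projected
    box first; returns {proj_index: [matching vis_index, ...]}."""
    def area(b):
        return max(b[2] - b[0], 0) * max(b[3] - b[1], 0)

    def overlap(a, b):
        w = min(a[2], b[2]) - max(a[0], b[0])
        h = min(a[3], b[3]) - max(a[1], b[1])
        return max(w, 0) * max(h, 0)

    result = {}
    # Smallest boxes (most likely signboards) are considered first.
    for pi in sorted(range(len(proj_boxes)), key=lambda i: area(proj_boxes[i])):
        result[pi] = [vi for vi, vb in enumerate(vis_boxes)
                      if overlap(proj_boxes[pi], vb) > area_threshold]
    return result
```

Sorting by an equivalent radius instead of area, as in the second variant above, only changes the sort key.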
As an embodiment, the process of the vehicle determining the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box based on the overlapping region may include: if it is determined, based on the overlapping region, that a region overlap exists between a fourth three-dimensional bounding box of the at least one three-dimensional bounding box and a fourth two-dimensional visual detection frame of the at least one two-dimensional visual detection frame, determining that the matching relationship between the fourth three-dimensional bounding box and the fourth two-dimensional visual detection frame is a one-to-one relationship.
As one embodiment, the process of the vehicle determining the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box based on the overlapping region may include: if it is determined, based on the overlapping region, that a plurality of fifth three-dimensional bounding boxes of the at least one three-dimensional bounding box overlap with a fifth two-dimensional visual detection frame of the at least one two-dimensional visual detection frame, determining that the matching relationship between the plurality of fifth three-dimensional bounding boxes and the fifth two-dimensional visual detection frame is a many-to-one relationship.
As one embodiment, the process of the vehicle determining the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box based on the overlapping region may include: if it is determined, based on the overlapping region, that no region overlap exists between a sixth three-dimensional bounding box of the at least one three-dimensional bounding box and a sixth two-dimensional visual detection frame of the at least one two-dimensional visual detection frame, determining that the sixth three-dimensional bounding box does not match the sixth two-dimensional visual detection frame.
As one embodiment, the process of the vehicle determining the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box based on the overlapping region may include: if it is determined, based on the overlapping region, that an overlapping region exists between a seventh three-dimensional bounding box of the at least one three-dimensional bounding box and a plurality of seventh two-dimensional visual detection frames, each including a signboard, of the at least one two-dimensional visual detection frame, determining that the matching relationship between the seventh three-dimensional bounding box and the plurality of seventh two-dimensional visual detection frames is a one-to-many relationship.
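The four cases above (one-to-one, many-to-one, unmatched, one-to-many) can be derived from a boolean overlap matrix. The sketch below assumes such a matrix has already been computed with the overlap-area test; the function name and output format are assumptions.

```python
def classify_relations(overlaps):
    """overlaps[i][j] is True when 3D box i's projection overlaps visual frame j.
    Returns, per 3D box index, one of the four matching relationships."""
    n3d = len(overlaps)
    n2d = len(overlaps[0]) if n3d else 0
    # For each visual frame, which 3D boxes overlap it.
    vis_partners = [[i for i in range(n3d) if overlaps[i][j]] for j in range(n2d)]
    relations = {}
    for i in range(n3d):
        partners = [j for j in range(n2d) if overlaps[i][j]]
        if not partners:
            relations[i] = ("unmatched", partners)
        elif len(partners) > 1:
            relations[i] = ("one-to-many", partners)
        elif len(vis_partners[partners[0]]) > 1:
            relations[i] = ("many-to-one", partners)
        else:
            relations[i] = ("one-to-one", partners)
    return relations
```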
Step S260: and obtaining the position of the at least one signboard included in the target area based on the matching relation.
As an implementation manner, after the vehicle obtains the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional surrounding frame, the position of the at least one signboard included in the target area may be obtained based on the matching relationship. If the vehicle detects that the matching relationship is a one-to-one relationship, a three-dimensional bounding box which is in a one-to-one relationship with the at least one two-dimensional visual detection frame may be determined from at least one three-dimensional bounding box as a first three-dimensional bounding box, and a two-dimensional visual detection frame which is in a one-to-one relationship with the first three-dimensional bounding box may be determined as a first two-dimensional visual detection frame. If the vehicle detects that the first two-dimensional visual detection frame comprises the signboard, a first position corresponding to the first three-dimensional surrounding frame can be obtained, and the first position can be determined to be the position of the signboard comprised by the first two-dimensional visual detection frame.
In this embodiment, the vehicle may determine, from the at least one three-dimensional bounding box, a three-dimensional bounding box that has a one-to-one relationship with the at least one two-dimensional visual detection box as the frame to be detected, in consideration of the small area of the signboard. If the vehicle detects that the area of the frame to be detected is smaller than the area threshold, the frame to be detected can be determined to be a first three-dimensional bounding frame.
As an embodiment, if the vehicle detects that the matching relationship is a many-to-one relationship, a plurality of three-dimensional bounding boxes in a many-to-one relationship with one of the at least one two-dimensional visual detection frame may be determined from the at least one three-dimensional bounding box, and the plurality of three-dimensional bounding boxes may be fused to obtain a second three-dimensional bounding box; the two-dimensional visual detection frame in a many-to-one relationship with the second three-dimensional bounding box may be determined as the second two-dimensional visual detection frame. If the vehicle detects that the second two-dimensional visual detection frame includes a signboard, a second position corresponding to the second three-dimensional bounding box can be obtained and determined to be the position of the signboard included in the second two-dimensional visual detection frame.
The process of the vehicle fusing the plurality of three-dimensional bounding boxes to obtain the second three-dimensional bounding box may include performing expansion processing on the plurality of three-dimensional bounding boxes until the plurality of three-dimensional bounding boxes are tangent to each other, and obtaining one three-dimensional bounding box formed after the expansion processing on the plurality of three-dimensional bounding boxes as the second three-dimensional bounding box. Alternatively, the vehicle may determine a three-dimensional bounding box having the largest area of a corresponding two-dimensional box of the plurality of three-dimensional bounding boxes as the second three-dimensional bounding box; the vehicle may further determine a three-dimensional bounding box having a smallest area of a corresponding two-dimensional box of the plurality of three-dimensional bounding boxes as the second three-dimensional bounding box.
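The dilation-until-tangency fusion above produces a single box covering the matched cluster. As a simple stand-in, the sketch below merges the boxes into their axis-aligned union; this is an illustrative approximation, not the application's exact procedure, and the representation is assumed.

```python
def fuse_boxes(boxes):
    """boxes: non-empty list of (min_xyz, max_xyz) axis-aligned 3D boxes;
    returns one box enclosing them all (a simple union-style fusion)."""
    lo = tuple(min(b[0][k] for b in boxes) for k in range(3))
    hi = tuple(max(b[1][k] for b in boxes) for k in range(3))
    return lo, hi
```

The alternatives named above (keeping the member box with the largest or smallest projected 2D area) would instead select one element of `boxes` rather than merging extents.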
As an embodiment, if the vehicle detects that the matching relationship is a one-to-many relationship, a third three-dimensional bounding box that is in a one-to-many relationship with a plurality of third two-dimensional visual detection frames of the at least one two-dimensional visual detection frame may be determined from the at least one three-dimensional bounding box, where each third two-dimensional visual detection frame includes a signboard. Accordingly, the vehicle may divide the third three-dimensional bounding box based on the plurality of third two-dimensional visual detection frames to obtain a target third three-dimensional bounding box corresponding to each of them, and may obtain the third position of each target third three-dimensional bounding box and use it as the position of the signboard included in the corresponding third two-dimensional visual detection frame. A third three-dimensional bounding box corresponding to a plurality of third two-dimensional visual detection frames can be understood as an under-segmented object.
In some embodiments, considering that the amount of calculation for determining the positions of the identification plates based on the one-to-one relationship is small, in this embodiment, the vehicle may preferentially perform the position detection of the identification plates on the matched pair of the three-dimensional bounding box and the two-dimensional visual detection frame whose matched relationship is the one-to-one relationship. If the vehicle detects that the two-dimensional visual detection frame in the matching pair comprises the signboard, the position corresponding to the three-dimensional surrounding frame in the matching pair can be obtained, and the position is determined to be the position of the signboard corresponding to the matching pair. In addition, considering that the area of the signboard is smaller, the vehicle can obtain a two-dimensional visual detection frame matched with the three-dimensional surrounding frame based on the sequence from small to large of the area of the two-dimensional frame corresponding to the three-dimensional surrounding frame, and the position detection of the signboard is carried out, so that the efficiency and the accuracy of the position detection of the signboard are improved.
In some embodiments, referring to fig. 6, the method for detecting a signboard according to an embodiment of the present application may further include steps S261 to S262 after step S260.
Step S261: and if the two-dimensional visual detection frame which is not matched with the at least one three-dimensional bounding box exists in the at least one two-dimensional visual detection frame, acquiring ground point cloud data corresponding to the target point cloud data.
In some embodiments, if the vehicle detects that the two-dimensional visual detection frame which is not matched with the at least one three-dimensional bounding frame exists in the at least one two-dimensional visual detection frame, the vehicle can acquire the ground point cloud data corresponding to the target point cloud data. The vehicle can perform ground point clustering on the target point cloud data to obtain the ground point cloud data.
Step S262: and if the unmatched two-dimensional visual detection frame is detected to be matched with the ground point cloud data, filtering the unmatched two-dimensional visual detection frame from the at least one two-dimensional visual detection frame to obtain at least one updated two-dimensional visual detection frame.
In some embodiments, if the vehicle detects that an unmatched two-dimensional visual detection frame matches the ground point cloud data, that frame may be filtered out from the at least one two-dimensional visual detection frame to obtain the updated at least one two-dimensional visual detection frame, so as to increase the rate of obtaining the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box.
The vehicle can calculate the area of the overlapping area of the two-dimensional visual detection frame and the ground point cloud data, compare the area with a ground matching area threshold, and determine that the two-dimensional visual detection frame and the ground point cloud data are matched if the area is greater than or equal to the ground matching area threshold. Accordingly, the vehicle may determine the unmatched two-dimensional visual detection frame that matches the ground point cloud data as a false-detection two-dimensional visual detection frame; accordingly, the vehicle may delete the unmatched two-dimensional visual inspection box that matches the ground point cloud data to save storage space of the vehicle.
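The ground check above can be sketched as follows. The application compares an overlap area against a ground-matching area threshold; counting projected ground points that fall inside the frame is used here as a simplified proxy, and the function name and parameters are assumptions.

```python
def is_ground_false_detection(vis_box, ground_pixels, min_hits):
    """vis_box: (u_min, v_min, u_max, v_max); ground_pixels: (u, v) positions of
    laser radar ground points projected into the image. An unmatched visual
    detection frame covering enough ground points is treated as a false
    detection and can be filtered out."""
    hits = sum(1 for u, v in ground_pixels
               if vis_box[0] <= u <= vis_box[2] and vis_box[1] <= v <= vis_box[3])
    return hits >= min_hits
```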
For example, please refer to fig. 7, which illustrates a flowchart of a method for detecting a signboard according to an embodiment of the present application. The vehicle can obtain ground point cloud data according to the target point cloud data; it can be understood that the laser radar ground points used here are unlabeled. The vehicle can also obtain the at least one three-dimensional bounding box segmented from the laser radar data, and can fuse the obstacle frame obtained by deep learning on the target point cloud data, the at least one three-dimensional bounding box, and the camera vision data to obtain the obstacle fusion result. The vehicle can also perform traffic signboard detection on the camera vision data to obtain the visual traffic signboards of the target area.
The vehicle may combine the point cloud data, which does not belong to the obstacle fusion result, in the target point cloud data determined after the deep learning with the ground point cloud data based on the ground point cloud data corresponding to the target point cloud data and the obstacle fusion result corresponding to the target area, obtain primary screening point cloud data, and obtain at least one three-dimensional bounding box according to the primary screening point cloud data.
In the case where the vehicle does not acquire the obstacle frame based on the deep learning algorithm, the vehicle may project at least one three-dimensional bounding box corresponding to the target point cloud data into the two-dimensional space. Correspondingly, the vehicle can also match the two-dimensional frames corresponding to the at least one three-dimensional bounding box with the at least one two-dimensional visual detection frame to obtain the matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box.
The vehicle can detect the position of the signboard by preferentially processing matched pairs of a three-dimensional bounding box and a two-dimensional visual detection frame whose matching relationship is one-to-one. If the vehicle detects, according to the visual traffic signboards of the target area, that the two-dimensional visual detection frame in a matched pair includes a signboard, the position corresponding to the three-dimensional bounding box in that pair can be obtained and determined to be the position of the signboard corresponding to the pair.
The vehicle can recursively match the three-dimensional bounding box with at least one two-dimensional visual detection frame according to the order from small radius to large radius based on the radius of the two-dimensional frame corresponding to the three-dimensional bounding box to obtain a matching relation, and the position of at least one signboard included in the target area is obtained based on the matching relation.
If the vehicle detects that the matching relationship is a one-to-many relationship, a third three-dimensional surrounding frame which is in a one-to-many relationship with a plurality of third two-dimensional visual detection frames in the at least one two-dimensional visual detection frame can be determined from the at least one three-dimensional surrounding frame, wherein each third two-dimensional visual detection frame comprises a signboard. Accordingly, the vehicle may divide the third three-dimensional bounding box based on the plurality of third two-dimensional visual inspection boxes, obtain a target third three-dimensional bounding box corresponding to each of the plurality of third two-dimensional visual inspection boxes, and may obtain a third position of the target third three-dimensional bounding box, and use the third position as a position of a signboard included in the plurality of third two-dimensional visual inspection boxes, so as to process the under-divided object.
The vehicle can also project the ground point cloud data into the two-dimensional space, and can acquire any two-dimensional visual detection frame in the at least one two-dimensional visual detection frame that is not matched with the at least one three-dimensional bounding box. The vehicle can cluster the laser radar ground points in the two-dimensional space and match the clustered ground points with the unmatched two-dimensional visual detection frames. If an unmatched two-dimensional visual detection frame is detected to match the clustered ground points, it can be determined to be a false-detection visual frame and filtered out, which saves storage space of the vehicle, increases the rate of matching the at least one three-dimensional bounding box with the updated two-dimensional visual detection frames, and increases the rate of determining the position of the signboard.
The vehicle can acquire the target two-dimensional visual detection frames corresponding to the at least one signboard, and can acquire the types corresponding to the at least one signboard according to the camera visual data corresponding to the target two-dimensional visual detection frames.
The vehicle can obtain the position and the type of the traffic signboard within the irradiation range of the laser radar sensor by utilizing the internal and external parameters of the camera and the laser radar in the vehicle body coordinate system, together with projection and multistage matching. In this way, in the signboard detection method combining the laser radar and vision, the position of the signboard is obtained through a multistage filtering method with a small amount of calculation and little scene dependence, and the false detection rate of the signboard is reduced.
Compared with the signboard detection method shown in fig. 1, the signboard detection method provided by the embodiment of the present application may further project the at least one three-dimensional bounding box into a two-dimensional space to obtain two-dimensional frames corresponding to the at least one three-dimensional bounding box; match the two-dimensional frames corresponding to the at least one three-dimensional bounding box with the at least one two-dimensional visual detection frame to obtain a matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box; and obtain the position of the at least one signboard included in the target area based on the matching relationship. By performing multistage matching and filtering on the two-dimensional visual detection frames and the three-dimensional bounding boxes, the position of the signboard is detected with less scene dependence and less calculation, so that the detection accuracy of the signboard is improved and the power consumption and cost of signboard detection are reduced.
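The projection and overlap computation underlying this matching can be sketched as follows, assuming a pinhole camera model with intrinsic matrix `K` and a body-to-camera extrinsic transform (the function names and the axis-aligned 2D reduction are illustrative assumptions):

```python
import numpy as np

def project_box_to_2d(corners_3d, K, T_cam_body):
    """Project the 8 corners of a 3D bounding box (vehicle body frame) into
    the image plane and take the axis-aligned 2D extent.

    corners_3d: (8, 3) corner coordinates in the body frame.
    K: (3, 3) camera intrinsic matrix.
    T_cam_body: (4, 4) extrinsic transform from body frame to camera frame.
    Returns (x_min, y_min, x_max, y_max) in pixels.
    """
    homo = np.hstack([corners_3d, np.ones((8, 1))])   # homogeneous (8, 4)
    cam = (T_cam_body @ homo.T).T[:, :3]              # corners in camera frame
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                       # perspective divide
    return uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()

def iou_2d(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)
```

Matching can then be decided by thresholding `iou_2d` between each projected frame and each visual detection frame.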
Referring to fig. 8, which shows a signboard detecting apparatus according to an embodiment of the present application, the signboard detecting apparatus 200 includes: the target point cloud data acquisition module 210, the three-dimensional bounding box obtaining module 220, the two-dimensional visual detection frame acquisition module 230, and the signboard position obtaining module 240, wherein:
the target point cloud data acquisition module 210 is configured to acquire target point cloud data corresponding to a target area through a laser radar of the vehicle, and acquire camera vision data corresponding to the target area through a camera of the vehicle, where the target area includes at least one signboard.
The three-dimensional bounding box obtaining module 220 is configured to perform point cloud segmentation on the target point cloud data, and obtain at least one three-dimensional bounding box corresponding to the target point cloud data.
The two-dimensional visual detection frame acquisition module 230 is configured to acquire at least one two-dimensional visual detection frame corresponding to the camera visual data.
The sign board position obtaining module 240 is configured to obtain a position of the at least one sign board included in the target area according to the at least one three-dimensional bounding box and the at least one two-dimensional visual detection box.
Further, the signboard position obtaining module 240 may include: the three-dimensional bounding box projection unit, the three-dimensional bounding box and two-dimensional visual detection frame matching unit and the position obtaining subunit of the signboard, wherein:
and the three-dimensional bounding box projection unit is used for projecting the at least one three-dimensional bounding box into a two-dimensional space to obtain two-dimensional boxes corresponding to the at least one three-dimensional bounding box respectively.
And the matching unit is used for matching the two-dimensional frames corresponding to the at least one three-dimensional surrounding frame with the at least one two-dimensional visual detection frame, and obtaining the matching relation between the at least one two-dimensional visual detection frame and the at least one three-dimensional surrounding frame.
And a position obtaining subunit for obtaining the position of the at least one signboard included in the target area based on the matching relationship.
Further, the position obtaining subunit of the signboard may include: a one-to-one relationship three-dimensional bounding box and two-dimensional visual detection box determining unit and a first position obtaining unit, wherein:
and the one-to-one relationship three-dimensional bounding box and two-dimensional visual detection frame determining unit is used for determining a three-dimensional bounding box which is in one-to-one relationship with the at least one two-dimensional visual detection frame from the at least one three-dimensional bounding box as a first three-dimensional bounding box and determining a two-dimensional visual detection frame which is in one-to-one relationship with the first three-dimensional bounding box as a first two-dimensional visual detection frame if the matching relationship is in one-to-one relationship.
And the first position obtaining unit is used for obtaining a first position corresponding to the first three-dimensional surrounding frame if the first two-dimensional visual detection frame is detected to comprise the signboard, and determining the first position as the position of the signboard comprised by the first two-dimensional visual detection frame.
Further, the one-to-one relationship three-dimensional bounding box and two-dimensional visual detection box determination unit may include: the device comprises a to-be-detected frame determining unit and a first three-dimensional bounding box determining unit, wherein:
and the to-be-detected frame determining unit is used for determining a three-dimensional bounding box which is in one-to-one relation with the at least one two-dimensional visual detection frame from the at least one three-dimensional bounding box as the to-be-detected frame.
And the first three-dimensional bounding box determining unit is used for determining the frame to be detected as the first three-dimensional bounding box if the area of the frame to be detected is smaller than an area threshold value.
Further, the position obtaining subunit of the signboard may include: a plurality of three-dimensional bounding box determining units of many-to-one relations, a plurality of three-dimensional bounding box fusing units, a second two-dimensional visual detection box determining unit of many-to-one relations, and a second position obtaining unit, wherein:
And the multi-to-one relation multi-three-dimensional bounding box determining unit is used for determining a plurality of three-dimensional bounding boxes which are in a multi-to-one relation with one two-dimensional visual detection frame in the at least one two-dimensional visual detection frame from the at least one three-dimensional bounding box if the matching relation is the multi-to-one relation.
And the three-dimensional bounding box fusion units are used for carrying out fusion processing on the three-dimensional bounding boxes to obtain a second three-dimensional bounding box.
And the second two-dimensional visual detection frame determining unit is used for determining a two-dimensional visual detection frame which is in a many-to-one relationship with the second three-dimensional surrounding frame in the at least one two-dimensional visual detection frame as a second two-dimensional visual detection frame.
And the second position obtaining unit is used for obtaining a second position corresponding to the second three-dimensional surrounding frame if the second two-dimensional visual detection frame is detected to comprise the signboard, and determining the second position as the position of the signboard comprised by the second two-dimensional visual detection frame.
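A minimal sketch of the fusion processing for the many-to-one case, assuming axis-aligned 3D boxes and taking the fused box as the smallest enclosing box (an illustrative choice, not the only possible fusion):

```python
def fuse_boxes_3d(boxes):
    """Fuse several axis-aligned 3D bounding boxes (over-segmented pieces of
    one signboard) into a single enclosing second three-dimensional box.

    Each box is (x_min, y_min, z_min, x_max, y_max, z_max).
    """
    mins = [min(b[k] for b in boxes) for k in range(3)]      # per-axis minima
    maxs = [max(b[k + 3] for b in boxes) for k in range(3)]  # per-axis maxima
    return tuple(mins + maxs)
```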
Further, the position obtaining subunit of the signboard may include: a third three-dimensional bounding box determination unit, a third three-dimensional bounding box segmentation unit, and a third position obtaining unit of a one-to-many relationship, wherein:
And a third three-dimensional bounding box determining unit for determining a third three-dimensional bounding box in a one-to-many relationship with a plurality of third two-dimensional visual detection boxes in the at least one two-dimensional visual detection box from the at least one three-dimensional bounding box if the matching relationship is in the one-to-many relationship, wherein each third two-dimensional visual detection box comprises a signboard.
And the third three-dimensional bounding box segmentation unit is used for segmenting the third three-dimensional bounding boxes based on the plurality of third two-dimensional visual detection frames to obtain target third three-dimensional bounding boxes corresponding to the plurality of third two-dimensional visual detection frames.
And a third position obtaining unit, configured to obtain a third position of the target third three-dimensional bounding box, and use the third position as a position of a signboard included in the plurality of third two-dimensional visual detection boxes.
Further, after the obtaining the position of the at least one signboard included in the target area according to the at least one three-dimensional bounding box and the at least one two-dimensional visual inspection box, the signboard inspection apparatus 200 may further include: the system comprises a ground point cloud data acquisition unit and at least one two-dimensional visual detection frame updating unit, wherein:
The ground point cloud data acquisition unit is used for acquiring the ground point cloud data corresponding to the target point cloud data if the two-dimensional visual detection frame which is not matched with the at least one three-dimensional bounding box exists in the at least one two-dimensional visual detection frame.
And the updating unit of the at least one two-dimensional visual detection frame is used for filtering the unmatched two-dimensional visual detection frame from the at least one two-dimensional visual detection frame to obtain the updated at least one two-dimensional visual detection frame if the unmatched two-dimensional visual detection frame is detected to be matched with the ground point cloud data.
Further, the three-dimensional bounding box and two-dimensional visual inspection box matching unit may include: the device comprises an overlapping region acquisition unit and a matching relation determination unit, wherein:
and the overlapping region acquisition unit is used for acquiring the overlapping region of the two-dimensional frames corresponding to the at least one two-dimensional visual detection frame and the at least one three-dimensional surrounding frame respectively.
And the matching relation determining unit is used for determining the matching relation between the at least one two-dimensional visual detection frame and the at least one three-dimensional surrounding frame based on the overlapping area.
Further, the matching relation determination unit may include: a one-to-one relationship determination unit, and/or a many-to-one relationship determination unit, and/or a mismatch determination unit, and/or a one-to-many relationship determination unit, wherein:
And the one-to-one relation determining unit is used for determining that the matching relationship between the fourth three-dimensional bounding box and the fourth two-dimensional visual detection frame is a one-to-one relationship if it is determined, based on the overlapping region, that a fourth three-dimensional bounding box in the at least one three-dimensional bounding box overlaps with a fourth two-dimensional visual detection frame in the at least one two-dimensional visual detection frame.
And the many-to-one relation determining unit is used for determining that the matching relationship between the plurality of fifth three-dimensional bounding boxes and the fifth two-dimensional visual detection frame is a many-to-one relationship if it is determined, based on the overlapping region, that a plurality of fifth three-dimensional bounding boxes in the at least one three-dimensional bounding box overlap with a fifth two-dimensional visual detection frame in the at least one two-dimensional visual detection frame.
And the mismatch determining unit is used for determining that a sixth three-dimensional bounding box in the at least one three-dimensional bounding box does not match a sixth two-dimensional visual detection frame in the at least one two-dimensional visual detection frame if it is determined, based on the overlapping region, that no overlapping region exists between the sixth three-dimensional bounding box and the sixth two-dimensional visual detection frame.
And the one-to-many relation determining unit is used for determining that the matching relationship between the seventh three-dimensional bounding box and the plurality of seventh two-dimensional visual detection frames comprising the signboard is a one-to-many relationship if it is determined, based on the overlapping region, that an overlapping region exists between a plurality of seventh two-dimensional visual detection frames each comprising a signboard and a seventh three-dimensional bounding box in the at least one three-dimensional bounding box.
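Taken together, the four determination units can be sketched as one classification pass over the pairwise overlaps; the function names and the returned dictionary layout are illustrative assumptions:

```python
def overlap(a, b):
    """True if two (x_min, y_min, x_max, y_max) boxes share any area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])

def classify_matches(proj_boxes, vis_boxes):
    """Classify projected 3D boxes against 2D visual detection frames into
    the one-to-one, many-to-one, one-to-many, and unmatched cases.
    Returns index-based results into proj_boxes / vis_boxes.
    """
    vis_hits = [[] for _ in vis_boxes]    # 3D boxes overlapping each visual frame
    proj_hits = [[] for _ in proj_boxes]  # visual frames overlapped by each 3D box
    for i, p in enumerate(proj_boxes):
        for j, v in enumerate(vis_boxes):
            if overlap(p, v):
                proj_hits[i].append(j)
                vis_hits[j].append(i)
    result = {"one_to_one": [], "many_to_one": [],
              "one_to_many": [], "unmatched_vis": []}
    for j, hits in enumerate(vis_hits):
        if not hits:
            result["unmatched_vis"].append(j)
        elif len(hits) == 1 and len(proj_hits[hits[0]]) == 1:
            result["one_to_one"].append((hits[0], j))
        elif len(hits) > 1:
            result["many_to_one"].append((tuple(hits), j))
    for i, hits in enumerate(proj_hits):
        if len(hits) > 1:
            result["one_to_many"].append((i, tuple(hits)))
    return result
```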
Further, before the obtaining the position of the signboard included in the target area according to the at least one three-dimensional bounding box and the at least one two-dimensional visual detection box, the signboard detecting apparatus 200 may further include: the device comprises an obstacle frame obtaining unit, an obstacle fusion result obtaining unit and at least one three-dimensional bounding box filtering unit, wherein:
and the obstacle frame obtaining unit is used for carrying out obstacle detection on the target point cloud data to obtain an obstacle frame corresponding to the target point cloud data.
And the obstacle fusion result obtaining unit is used for obtaining an obstacle fusion result corresponding to the target point cloud data according to the obstacle frame, the at least one three-dimensional bounding box and the camera vision data.
And the at least one three-dimensional bounding box filtering unit is used for filtering the at least one three-dimensional bounding box according to the obstacle fusion result to obtain the filtered at least one three-dimensional bounding box.
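A sketch of filtering candidate boxes against the obstacle fusion result, assuming axis-aligned 3D boxes and an illustrative IoU threshold of 0.3 (the threshold value is an assumption, not taken from the embodiment):

```python
def iou_3d(a, b):
    """IoU of two axis-aligned boxes (x_min, y_min, z_min, x_max, y_max, z_max)."""
    inter = 1.0
    for k in range(3):
        lo, hi = max(a[k], b[k]), min(a[k + 3], b[k + 3])
        if hi <= lo:
            return 0.0
        inter *= hi - lo
    def vol(c):
        return (c[3] - c[0]) * (c[4] - c[1]) * (c[5] - c[2])
    return inter / (vol(a) + vol(b) - inter)

def filter_candidates(candidates, obstacle_boxes, thresh=0.3):
    """Keep only candidate sign boxes that do not coincide with a fused
    obstacle (vehicle, pedestrian, ...)."""
    return [c for c in candidates
            if all(iou_3d(c, o) < thresh for o in obstacle_boxes)]
```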
Further, after the obtaining the position of the at least one signboard included in the target area according to the at least one three-dimensional bounding box and the at least one two-dimensional visual inspection box, the signboard inspection apparatus 200 may further include: the type acquisition unit of target two-dimensional visual detection frame acquisition unit and signboard, wherein:
And the target two-dimensional visual detection frame acquisition unit is used for acquiring the target two-dimensional visual detection frames corresponding to the at least one signboard.
The type acquisition unit is used for acquiring the type corresponding to each of the at least one signboard according to the camera vision data corresponding to the target two-dimensional vision detection frame.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus and modules described above may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
In several embodiments provided herein, the coupling of the modules to each other may be electrical, mechanical, or other.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
Referring to fig. 9, a block diagram of a vehicle according to an embodiment of the present application is shown. The vehicle 100 may be an electric vehicle, a gasoline vehicle, or another device having processing capability. The vehicle 100 in the present application may include one or more of the following components: a processor 110, a memory 120, and one or more application programs, wherein the one or more application programs may be stored in the memory 120 and configured to be executed by the one or more processors 110, the one or more application programs being configured to perform the method described in the foregoing method embodiments.
Processor 110 may include one or more processing cores. The processor 110 uses various interfaces and lines to connect various portions of the overall vehicle 100, and performs various functions of the vehicle 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and invoking data stored in the memory 120. Optionally, the processor 110 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing the content to be displayed; and the modem is used to handle wireless communication. It will be appreciated that the modem may also not be integrated into the processor 110 and may be implemented solely by a single communication chip.
The Memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Memory 120 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 120 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the foregoing method embodiments, etc. The stored data area may also store data created by the vehicle 100 in use (e.g., phonebook, audio and video data, chat record data), and the like.
Referring to fig. 10, a block diagram of a computer readable storage medium according to an embodiment of the present application is shown. The computer readable storage medium 300 has stored therein program code that can be invoked by a processor to perform the methods described in the method embodiments described above.
The computer readable storage medium 300 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM. Optionally, the computer readable storage medium 300 comprises a non-transitory computer-readable storage medium. The computer readable storage medium 300 has storage space for program code 310 that performs any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 310 may, for example, be compressed in a suitable form.
In summary, according to the method, the device and the vehicle for detecting the signboard, provided by the embodiment of the application, target point cloud data corresponding to a target area are obtained through a laser radar of the vehicle, and camera visual data corresponding to the target area is obtained through a camera of the vehicle, wherein the target area comprises at least one signboard; performing point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data; acquiring at least one two-dimensional visual detection frame corresponding to the camera visual data; according to the three-dimensional surrounding frame and the two-dimensional visual detection frame, the position of at least one signboard included in the target area is obtained, so that the position of the signboard included in the target area is obtained according to the three-dimensional point cloud data corresponding to the target area obtained through the laser radar and the two-dimensional camera visual data corresponding to the target area obtained through the camera, the signboard is detected in a mode of combining the laser radar and the vision, and the accuracy of determining the position of the signboard is improved.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, one of ordinary skill in the art will appreciate that: the technical scheme described in the foregoing embodiments can still be modified, or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (14)

1. A method of signboard detection, characterized by being applied to a vehicle, the method comprising:
acquiring target point cloud data corresponding to a target area through a laser radar of the vehicle, and acquiring camera vision data corresponding to the target area through a camera of the vehicle, wherein the target area comprises at least one signboard;
performing point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data;
acquiring at least one two-dimensional visual detection frame corresponding to the camera visual data;
and obtaining the position of the at least one signboard included in the target area according to the at least one three-dimensional surrounding frame and the at least one two-dimensional visual detection frame.
2. The method of claim 1, wherein said obtaining the location of the at least one signboard included in the target area based on the at least one three-dimensional bounding box and the at least one two-dimensional visual inspection box comprises:
projecting the at least one three-dimensional bounding box into a two-dimensional space to obtain two-dimensional boxes corresponding to the at least one three-dimensional bounding box respectively;
Matching the two-dimensional frames corresponding to the at least one three-dimensional bounding box with the at least one two-dimensional visual detection frame to obtain a matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding box;
and obtaining the position of the at least one signboard included in the target area based on the matching relation.
3. The method of claim 2, wherein the obtaining the location of the at least one signboard included in the target area based on the matching relationship comprises:
if the matching relationship is a one-to-one relationship, determining a three-dimensional bounding box which is in a one-to-one relationship with the at least one two-dimensional visual detection frame from the at least one three-dimensional bounding box as a first three-dimensional bounding box, and determining a two-dimensional visual detection frame which is in a one-to-one relationship with the first three-dimensional bounding box as a first two-dimensional visual detection frame;
if the first two-dimensional visual detection frame comprises the signboard, a first position corresponding to the first three-dimensional surrounding frame is obtained, and the first position is determined to be the position of the signboard comprised by the first two-dimensional visual detection frame.
4. A method according to claim 3, wherein said determining a three-dimensional bounding box from said at least one three-dimensional bounding box in a one-to-one relationship with said at least one two-dimensional visual inspection box as a first three-dimensional bounding box comprises:
Determining a three-dimensional bounding box which is in one-to-one relation with the at least one two-dimensional visual detection frame from the at least one three-dimensional bounding box as a frame to be detected;
and if the area of the frame to be detected is smaller than an area threshold value, determining the frame to be detected as the first three-dimensional bounding box.
5. The method of claim 2, wherein the obtaining the location of the at least one signboard included in the target area based on the matching relationship comprises:
if the matching relationship is a many-to-one relationship, determining a plurality of three-dimensional bounding boxes which are in a many-to-one relationship with one two-dimensional visual detection box in the at least one two-dimensional visual detection box from the at least one three-dimensional bounding box;
performing fusion processing on the plurality of three-dimensional bounding boxes to obtain a second three-dimensional bounding box;
determining a two-dimensional visual detection frame which is in a many-to-one relation with the second three-dimensional surrounding frame in the at least one two-dimensional visual detection frame as a second two-dimensional visual detection frame;
if the fact that the second two-dimensional visual detection frame comprises the signboard is detected, a second position corresponding to the second three-dimensional surrounding frame is obtained, and the second position is determined to be the position of the signboard contained in the second two-dimensional visual detection frame.
6. The method of claim 2, wherein the obtaining the location of the at least one signboard included in the target area based on the matching relationship comprises:
if the matching relationship is a one-to-many relationship, determining a third three-dimensional bounding box which is in a one-to-many relationship with a plurality of third two-dimensional visual detection boxes in the at least one two-dimensional visual detection box from the at least one three-dimensional bounding box, wherein each third two-dimensional visual detection box comprises a signboard;
dividing the third three-dimensional bounding box based on the plurality of third two-dimensional visual detection boxes to obtain target third three-dimensional bounding boxes corresponding to the plurality of third two-dimensional visual detection boxes respectively;
and acquiring a third position of the target third three-dimensional bounding box, and taking the third position as the positions of the identification plates included by the plurality of third two-dimensional visual detection boxes.
7. The method of claim 2, further comprising, after the obtaining the location of the at least one signboard included in the target area according to the at least one three-dimensional bounding box and the at least one two-dimensional visual inspection box:
If the two-dimensional visual detection frame which is not matched with the at least one three-dimensional bounding frame exists in the at least one two-dimensional visual detection frame, acquiring ground point cloud data corresponding to the target point cloud data;
and if the unmatched two-dimensional visual detection frame is detected to be matched with the ground point cloud data, filtering the unmatched two-dimensional visual detection frame from the at least one two-dimensional visual detection frame to obtain at least one updated two-dimensional visual detection frame.
8. The method according to claim 2, wherein said matching the two-dimensional frames corresponding to the at least one three-dimensional bounding box with the at least one two-dimensional visual inspection box, to obtain the matching relationship between the at least one two-dimensional visual inspection box and the at least one three-dimensional bounding box, includes:
acquiring overlapping areas of the two-dimensional frames corresponding to the at least one two-dimensional visual detection frame and the at least one three-dimensional surrounding frame respectively;
and determining a matching relationship between the at least one two-dimensional visual detection frame and the at least one three-dimensional bounding frame based on the overlapping region.
9. The method of claim 8, wherein the determining a matching relationship of the at least one two-dimensional visual inspection box and the at least one three-dimensional bounding box based on the overlap region comprises:
if it is determined, based on the overlapping region, that a fourth three-dimensional bounding box in the at least one three-dimensional bounding box overlaps with a fourth two-dimensional visual detection frame in the at least one two-dimensional visual detection frame, determining that the matching relationship between the fourth three-dimensional bounding box and the fourth two-dimensional visual detection frame is a one-to-one relationship; and/or
if it is determined, based on the overlapping region, that a plurality of fifth three-dimensional bounding boxes in the at least one three-dimensional bounding box overlap with a fifth two-dimensional visual detection frame in the at least one two-dimensional visual detection frame, determining that the matching relationship between the plurality of fifth three-dimensional bounding boxes and the fifth two-dimensional visual detection frame is a many-to-one relationship; and/or
if it is determined, based on the overlapping region, that no overlapping region exists between a sixth three-dimensional bounding box in the at least one three-dimensional bounding box and a sixth two-dimensional visual detection frame in the at least one two-dimensional visual detection frame, determining that the sixth three-dimensional bounding box does not match the sixth two-dimensional visual detection frame; and/or
if it is determined, based on the overlapping region, that an overlapping region exists between a plurality of seventh two-dimensional visual detection frames each comprising a signboard and a seventh three-dimensional bounding box in the at least one three-dimensional bounding box, determining that the matching relationship between the seventh three-dimensional bounding box and the plurality of seventh two-dimensional visual detection frames comprising the signboard is a one-to-many relationship.
10. The method according to any one of claims 1-9, further comprising, prior to said obtaining the position of the signboard comprised by the target area from said at least one three-dimensional bounding box and said at least one two-dimensional visual inspection box:
performing obstacle detection on the target point cloud data to obtain an obstacle frame corresponding to the target point cloud data;
obtaining an obstacle fusion result corresponding to the target point cloud data according to the obstacle frame, the at least one three-dimensional bounding box and the camera vision data;
and filtering the at least one three-dimensional bounding box according to the obstacle fusion result to obtain the filtered at least one three-dimensional bounding box.
11. The method according to any one of claims 1-9, further comprising, after said obtaining the position of said at least one signboard comprised by said target area from said at least one three-dimensional bounding box and said at least one two-dimensional visual inspection box:
acquiring target two-dimensional visual detection boxes respectively corresponding to the at least one signboard;
and obtaining types respectively corresponding to the at least one signboard according to the camera vision data corresponding to the target two-dimensional visual detection boxes.
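Claim 11 leaves the type classifier unspecified, so the step can be sketched as cropping each signboard's target two-dimensional detection box from the camera image and handing the crop to a pluggable classifier. The function names, the nested-list image representation, and the toy dominant-value classifier are all illustrative assumptions.

```python
def crop(image, box):
    """image: nested list [rows][cols] of pixel values; box: (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def classify_sign(image, box, classifier):
    """Crop the region covered by the 2D detection box and classify it."""
    return classifier(crop(image, box))

def dominant_value(region):
    """Toy stand-in classifier: label a crop by its most frequent pixel value."""
    counts = {}
    for row in region:
        for px in row:
            counts[px] = counts.get(px, 0) + 1
    return max(counts, key=counts.get)
```

In practice the `classifier` argument would be a trained model; the sketch only shows how each signboard's type is derived from the camera data inside its own target detection box.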
12. A signboard detection apparatus for use with a vehicle, the apparatus comprising:
a target point cloud data acquisition module, configured to acquire target point cloud data corresponding to a target area through a laser radar of the vehicle and acquire camera vision data corresponding to the target area through a camera of the vehicle, wherein the target area comprises at least one signboard;
a three-dimensional bounding box obtaining module, configured to perform point cloud segmentation on the target point cloud data to obtain at least one three-dimensional bounding box corresponding to the target point cloud data;
a two-dimensional visual detection box acquisition module, configured to acquire at least one two-dimensional visual detection box corresponding to the camera vision data;
and a signboard position obtaining module, configured to obtain the position of the at least one signboard included in the target area according to the at least one three-dimensional bounding box and the at least one two-dimensional visual detection box.
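The module structure of the apparatus in claim 12 can be sketched as a small pipeline class. The class name, the callable attributes, and the stand-in segmentation, detection, and fusion functions are assumptions for illustration; the patent does not prescribe any particular implementation.

```python
class SignboardDetector:
    """Pipeline mirroring the four modules of claim 12."""

    def __init__(self, segment_point_cloud, detect_2d, fuse):
        self.segment_point_cloud = segment_point_cloud  # point cloud -> 3D bounding boxes
        self.detect_2d = detect_2d                      # camera data  -> 2D detection boxes
        self.fuse = fuse                                # (3D, 2D)     -> signboard positions

    def detect(self, target_point_cloud, camera_vision_data):
        # Module 2: point cloud segmentation -> at least one 3D bounding box.
        boxes_3d = self.segment_point_cloud(target_point_cloud)
        # Module 3: visual detection -> at least one 2D detection box.
        boxes_2d = self.detect_2d(camera_vision_data)
        # Module 4: fuse both box sets into signboard positions.
        return self.fuse(boxes_3d, boxes_2d)
```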
13. A vehicle, characterized by comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-11.
14. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a program code, which is callable by a processor for executing the method according to any one of claims 1-11.
CN202311760806.6A 2023-12-19 2023-12-19 Signboard detection method and device and vehicle Pending CN117671644A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202311760806.6A CN117671644A (en) 2023-12-19 2023-12-19 Signboard detection method and device and vehicle
PCT/CN2024/136804 WO2025130617A1 (en) 2023-12-19 2024-12-04 Sign detection method and apparatus, and vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311760806.6A CN117671644A (en) 2023-12-19 2023-12-19 Signboard detection method and device and vehicle

Publications (1)

Publication Number Publication Date
CN117671644A true CN117671644A (en) 2024-03-08

Family

ID=90084541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311760806.6A Pending CN117671644A (en) 2023-12-19 2023-12-19 Signboard detection method and device and vehicle

Country Status (2)

Country Link
CN (1) CN117671644A (en)
WO (1) WO2025130617A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118115672A (en) * 2024-03-18 2024-05-31 北京清丰智行科技有限公司 A method and device for three-dimensional reconstruction of traffic signs based on multi-sensor fusion
WO2025130617A1 (en) * 2023-12-19 2025-06-26 广州小鹏自动驾驶科技有限公司 Sign detection method and apparatus, and vehicle

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115205391A (en) * 2022-05-20 2022-10-18 武汉理工大学 Target prediction method based on three-dimensional laser radar and vision fusion
CN116229408A (en) * 2022-11-22 2023-06-06 重庆邮电大学 A target recognition method based on the fusion of image information and lidar point cloud information
CN116721162A (en) * 2023-05-11 2023-09-08 深圳元戎启行科技有限公司 External parameter calibration method for radar and camera, electronic equipment and storage medium
CN117671644A (en) * 2023-12-19 2024-03-08 广州小鹏自动驾驶科技有限公司 Signboard detection method and device and vehicle


Also Published As

Publication number Publication date
WO2025130617A1 (en) 2025-06-26

Similar Documents

Publication Publication Date Title
US11282210B2 (en) Method and apparatus for segmenting point cloud data, storage medium, and electronic device
CN109948661B (en) A 3D vehicle detection method based on multi-sensor fusion
CN111874006B (en) Route planning processing method and device
CN111091037B (en) Method and device for determining driving information
CN117671644A (en) Signboard detection method and device and vehicle
CN112967283A (en) Target identification method, system, equipment and storage medium based on binocular camera
CN111339996B (en) Method, device, equipment and storage medium for detecting static obstacle
JPWO2007083494A1 (en) Graphic recognition apparatus, graphic recognition method, and graphic recognition program
CN112990049A (en) AEB emergency braking method and device for automatic driving of vehicle
CN115372990A (en) High-precision semantic map building method and device and unmanned vehicle
CN113255444A (en) Training method of image recognition model, image recognition method and device
CN115327524B (en) Road side end target detection method and device based on millimeter wave radar and vision fusion
CN114120254B (en) Road information recognition method, device and storage medium
WO2024012211A1 (en) Autonomous-driving environmental perception method, medium and vehicle
CN113611008B (en) Vehicle driving scene acquisition method, device, equipment and medium
CN115327572A (en) A method for detecting obstacles in front of a vehicle
CN109635701B (en) Lane passing attribute acquisition method, lane passing attribute acquisition device and computer readable storage medium
CN116309943B (en) Parking lot semantic map road network construction method and device and electronic equipment
CN108725318B (en) Automobile safety early warning method and device and computer readable storage medium
CN118898825B (en) Road environment state perception method, equipment, medium, program product and vehicle
Song et al. Automatic detection and classification of road, car, and pedestrian using binocular cameras in traffic scenes with a common framework
CN118898838B (en) Method, device, medium and vehicle for determining three-dimensional shape information of road obstacles
CN118894117B (en) Unmanned vehicle, driving route generation method, device, medium and product
CN110727269A (en) Vehicle control method and related products
CN118898689A (en) Cost map generation method, device, medium, product and unmanned vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination