
CN113435318B - Neural network training, image detection, driving control method and device - Google Patents

Neural network training, image detection, driving control method and device

Info

Publication number
CN113435318B
CN113435318B CN202110713234.0A
Authority
CN
China
Prior art keywords
vehicle
information
image
dimensional
visible
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110713234.0A
Other languages
Chinese (zh)
Other versions
CN113435318A (en)
Inventor
李昂
蒋沁宏
石建萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority to CN202110713234.0A
Publication of CN113435318A
Application granted
Publication of CN113435318B
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract


The present disclosure provides a neural network training, image detection, driving control method, device, electronic device and storage medium. The neural network training method includes: obtaining a sample image and two-dimensional annotation data of the sample image; the two-dimensional annotation data includes detection frame information of a target vehicle in the sample image, and at least one attribute information capable of characterizing the three-dimensional posture of the target vehicle; based on the sample image and the two-dimensional annotation data, training a target neural network including multiple branch networks; wherein, after the sample image is input into the target neural network, each branch network outputs one of the following information respectively: two-dimensional detection frame information of the target vehicle, and at least one attribute information capable of characterizing the three-dimensional posture of the target vehicle.

Description

Neural network training, image detection and driving control method and device
Technical Field
The disclosure relates to the technical field of deep learning, in particular to a neural network training, image detection and driving control method, device, electronic equipment and storage medium.
Background
With the development of technology, an autopilot function, a driver assist function, and the like are widely used in vehicles. Among them, in the automatic driving function and the auxiliary driving function, it is important to detect a traveling vehicle on a road.
Generally, a camera may be disposed on a vehicle, an image is collected by the camera, and a running vehicle is detected from the collected image to obtain a two-dimensional detection result. The two-dimensional detection result of the running vehicle is then input into downstream modules, such as a tracking module and a ranging module, to obtain three-dimensional information of the running vehicle. However, because the collected image lacks spatial information, the accuracy of the three-dimensional information obtained from the two-dimensional detection result is low.
Disclosure of Invention
In view of this, the present disclosure provides at least a neural network training, image detection, driving control method, apparatus, electronic device, and storage medium.
In a first aspect, the present disclosure provides a neural network training method, comprising:
The method comprises the steps of acquiring a sample image and two-dimensional annotation data of the sample image, wherein the two-dimensional annotation data comprises detection frame information of a target vehicle in the sample image and at least one attribute information capable of representing the three-dimensional pose of the target vehicle;
Training a target neural network comprising a plurality of branch networks based on the sample image and the two-dimensional annotation data;
Wherein, after the sample image is input to the target neural network, each branch network outputs one of two-dimensional detection frame information of the target vehicle and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle, respectively.
According to the method, a sample image and its two-dimensional annotation data are obtained, where the annotation data comprise detection frame information of the target vehicle and at least one attribute information capable of representing the three-dimensional pose of the target vehicle, and the target neural network is trained with the sample image and the annotation data. Because the two-dimensional annotation data include the at least one attribute information, the trained target neural network can predict that attribute information, which enriches the data types available for determining the three-dimensional pose of the target vehicle. The three-dimensional pose information of the target vehicle can therefore be determined more accurately from the at least one attribute information and the detection frame information predicted by the target neural network.
Meanwhile, the target neural network comprises a plurality of branch networks. After the sample image is input into the target neural network, each branch network outputs one of the two-dimensional detection frame information of the target vehicle and the at least one attribute information capable of representing its three-dimensional pose; because these outputs are produced in parallel, the information detection efficiency is improved.
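The parallel-branch idea above can be sketched minimally in NumPy. This is an illustrative reconstruction only, not the patent's implementation: the feature dimension, the branch names, and the single-linear-layer heads are all assumptions chosen for clarity. The point is that every branch reads the same shared feature and emits one output type.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared backbone feature for one input image (dimension chosen arbitrarily).
FEAT_DIM = 16
feature = rng.standard_normal(FEAT_DIM)

def make_branch(out_dim):
    """Each branch is sketched as a single linear layer over the shared feature."""
    w = rng.standard_normal((out_dim, FEAT_DIM)) * 0.1
    b = np.zeros(out_dim)
    return lambda f: w @ f + b

# One branch per output, as in the method: all branches read the same feature in parallel.
branches = {
    "bbox_2d": make_branch(4),        # two-dimensional detection frame (x1, y1, x2, y2)
    "demarcation_x": make_branch(1),  # horizontal coordinate of the demarcation point
    "wheel_contacts": make_branch(4), # (x, y) for up to two visible wheel-ground contact points
    "orientation": make_branch(3),    # logits over the three coarse orientation categories
}

outputs = {name: head(feature) for name, head in branches.items()}
for name, out in outputs.items():
    print(name, out.shape)
```

In a real system each `make_branch` head would be a small convolutional or fully connected sub-network trained jointly with the backbone, but the fan-out structure is the same.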
In a possible implementation manner, the attribute information includes at least one of the following:
First position information of any demarcation point on a demarcation line between adjacent visible faces of the target vehicle, second position information of a contact point between at least one visible wheel of the target vehicle and the ground, orientation information of the target vehicle;
the orientation information comprises first orientation information and/or second orientation information, and the second orientation indicated by the second orientation information is covered by the first orientation indicated by the first orientation information.
Here, a plurality of attribute information are set, so that the content of the attribute information is enriched, and the three-dimensional pose information of the target vehicle can be accurately determined according to at least one attribute information.
In a possible embodiment, where the orientation information includes first orientation information, the first orientation information includes a first category characterizing that the front of the vehicle is visible and the rear is not, a second category characterizing that the rear of the vehicle is visible and the front is not, and a third category characterizing that neither the front nor the rear of the vehicle is visible.
In a possible embodiment, where the orientation information includes second orientation information, the second orientation information includes: a first intermediate category in which the front and rear of the vehicle are not visible and the left side is visible; a second intermediate category in which the front and rear of the vehicle are not visible and the right side is visible; a third intermediate category in which the rear of the vehicle is visible and the front, left side, and right side are not visible; a fourth intermediate category in which the rear of the vehicle is visible, the front is not visible, and the right side is visible; a fifth intermediate category in which the rear of the vehicle is visible, the front is not visible, and the left side is visible; a sixth intermediate category in which the front of the vehicle is visible, the rear is not visible, and the right side is visible; and a seventh intermediate category in which the front of the vehicle is visible, the rear is not visible, and the left side is visible.
By adopting the method, the orientation of the target vehicle can be accurately represented through the set first orientation information and/or second orientation information.
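The coarse/fine orientation scheme above can be written down as a simple label mapping. The category names below are paraphrases of the visibility descriptions, not the patent's own terms, and the assignment of each fine category to a coarse one follows directly from which of the front/rear faces it marks as visible.

```python
# Coarse (first) orientation: 3 categories. Fine (second) orientation: 7 categories.
# Each fine category is "covered" by exactly one coarse category.

FIRST = ["front_visible_rear_not", "rear_visible_front_not", "neither_front_nor_rear"]

SECOND_TO_FIRST = {
    "left_side_only":  "neither_front_nor_rear",
    "right_side_only": "neither_front_nor_rear",
    "rear_only":       "rear_visible_front_not",
    "rear_and_right":  "rear_visible_front_not",
    "rear_and_left":   "rear_visible_front_not",
    "front_and_right": "front_visible_rear_not",
    "front_and_left":  "front_visible_rear_not",
}

def coarse_label(fine: str) -> str:
    """Map a fine orientation category to the coarse category that covers it."""
    return SECOND_TO_FIRST[fine]

print(coarse_label("rear_and_left"))
```

Such a mapping lets the fine orientation branch be supervised consistently with the coarse one, e.g. by penalizing fine predictions whose covering coarse category disagrees with the coarse branch.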
In a possible implementation manner, in a case that the attribute information includes first position information of the demarcation point, acquiring two-dimensional labeling data of the sample image includes:
acquiring the coordinate information of the demarcation point in the horizontal direction of an image coordinate system corresponding to the sample image;
and determining the coordinate information of the target vehicle in the vertical direction indicated by the detection frame information as the coordinate information of the demarcation point in the vertical direction of the image coordinate system corresponding to the sample image.
By adopting the above method, the coordinate information of the demarcation point in the vertical direction of the image coordinate system corresponding to the sample image can be determined from the detection frame, while only its coordinate in the horizontal direction needs to be acquired. Further, when the target neural network determines the first position information of the demarcation point, it only needs to regress the horizontal coordinate, not the vertical one. This reduces the number of regressed data types, and thereby avoids the loss of accuracy in other regression data (such as detection frame information) that can occur when too many data types are regressed together.
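The label construction above can be sketched in a few lines. The helper below is a hypothetical illustration (its name and the corner-format box are assumptions): only the horizontal coordinate of the demarcation point is annotated, and its vertical coordinate is taken from the bottom edge of the detection frame, so the network never has to regress it.

```python
def demarcation_label(bbox, demarcation_x):
    """Build the demarcation-point label from a 2D box and an annotated x-coordinate.

    bbox is (x1, y1, x2, y2) in image coordinates, with y growing downward.
    The vertical coordinate is inherited from the detection frame's bottom
    edge (y2), so only demarcation_x needs to be annotated and regressed.
    """
    x1, y1, x2, y2 = bbox
    assert x1 <= demarcation_x <= x2, "demarcation point must lie on the box"
    return (demarcation_x, y2)

print(demarcation_label((100, 50, 300, 200), 180))
```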
In a possible implementation manner, in a case that the attribute information includes the second position information of the contact point, acquiring two-dimensional labeling data of the sample image includes:
The method comprises the steps of obtaining the coordinate information of the contact point in the horizontal direction of an image coordinate system corresponding to the sample image, determining the coordinate information of the target vehicle in the vertical direction indicated by the detection frame information as the coordinate information of the contact point in the vertical direction of the image coordinate system corresponding to the sample image, and/or,
Or, acquiring the coordinate information of the contact point in the vertical direction of the image coordinate system corresponding to the sample image, and determining the coordinate information of the target vehicle in the horizontal direction indicated by the detection frame information as the coordinate information of the contact point in the horizontal direction of the image coordinate system corresponding to the sample image.
By adopting the above method, one coordinate in the second position information of the contact point can be determined from the detection frame while the other coordinate is acquired from annotation. This reduces the number of regressed data types, and thereby avoids the loss of accuracy in other regression data (such as detection frame information and detection frame category) that can occur when too many data types are regressed together.
In a second aspect, the present disclosure provides an image detection method, the method comprising:
acquiring an image to be detected;
Inputting the image to be detected into a trained target neural network comprising a plurality of branch networks, and obtaining two-dimensional detection frame information of a vehicle in the image to be detected, which is output by the plurality of branch networks in parallel, and at least one attribute information capable of representing the three-dimensional pose of the vehicle;
And determining a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.
In the method, the target neural network is obtained based on the neural network training method in the first aspect, so that the trained target neural network can more accurately output the two-dimensional detection frame information and at least one attribute information of the vehicle, and further, the detection result of the image to be detected can be more accurately determined based on the two-dimensional detection frame information and the at least one attribute information of the vehicle.
In a possible implementation manner, determining a detection result of the image to be detected based on two-dimensional detection frame information of the vehicle and the at least one attribute information includes:
Depth information of the vehicle is determined based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.
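One common way a wheel-ground contact point can feed a ranging step is the flat-ground pinhole model; the patent does not spell out its exact formula, so the function below is a standard-technique sketch, not the disclosed method, and all parameter names are assumptions. With known focal length `fy`, principal row `cy`, and camera mounting height, the image row of the contact point determines depth as Z = fy * camera_height / (v - cy).

```python
def ground_plane_depth(v_contact, fy, cy, camera_height):
    """Estimate depth from the image row of a wheel-ground contact point.

    Assumes a flat ground plane and a forward-facing pinhole camera:
        Z = fy * camera_height / (v_contact - cy)
    Valid only when the contact point projects below the principal point.
    """
    dv = v_contact - cy
    if dv <= 0:
        raise ValueError("contact point must be below the principal point")
    return fy * camera_height / dv

# A camera 1.5 m above the ground, fy = 1000 px, principal row cy = 540:
print(round(ground_plane_depth(640, 1000.0, 540.0, 1.5), 2))  # 15.0
```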
In a possible implementation manner, determining a detection result of the image to be detected based on two-dimensional detection frame information of the vehicle and the at least one attribute information includes:
And determining three-dimensional detection data of the vehicle based on the image to be detected, two-dimensional detection frame information of the vehicle included in the image to be detected and at least one attribute information corresponding to the vehicle.
In a possible embodiment, in a case where the at least one attribute information includes first position information of any demarcation point on a demarcation line between adjacent visible faces of the vehicle and second position information of a contact point between at least one visible wheel of the vehicle and the ground, the determining three-dimensional detection data of the vehicle based on the image to be detected, two-dimensional detection frame information of the vehicle included in the image to be detected, and at least one attribute information corresponding to the vehicle includes:
determining two-dimensional compact frame information characterizing a single plane of the vehicle based on the first location information of the demarcation point, the second location information of the contact point, and the two-dimensional detection frame information of the vehicle;
and determining three-dimensional detection data of the vehicle based on the two-dimensional compact frame information and the image to be detected.
By adopting the method, because the space information contained in the two-dimensional compact frame information is more accurate, the three-dimensional detection data of the vehicle can be more accurately determined based on the two-dimensional compact frame information and the image to be detected.
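A simple way to realize a "compact frame" covering a single visible face is to split the full detection frame at the demarcation line and keep the side that contains the wheel-ground contact points. This is an illustrative reconstruction under that assumption; the patent does not publish this exact rule, and the function name and box format are hypothetical.

```python
def compact_frame(bbox, demarcation_x, contact_xs):
    """Derive a 2D compact frame covering a single visible face of the vehicle.

    bbox is the full detection frame (x1, y1, x2, y2); demarcation_x is the
    horizontal coordinate of the demarcation line; contact_xs are the
    horizontal coordinates of the visible wheel-ground contact points.
    The face containing the contact points is kept.
    """
    x1, y1, x2, y2 = bbox
    mean_contact_x = sum(contact_xs) / len(contact_xs)
    if mean_contact_x >= demarcation_x:
        return (demarcation_x, y1, x2, y2)  # keep the right-hand face
    return (x1, y1, demarcation_x, y2)      # keep the left-hand face

print(compact_frame((100, 50, 300, 200), 180, [200, 280]))  # (180, 50, 300, 200)
```

Because the compact frame bounds one planar face rather than the whole vehicle, its edges align with actual vehicle geometry, which is why it carries more precise spatial information for the 3D step.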
In a third aspect, the present disclosure provides a running control method including:
acquiring a road image acquired by a running device in the running process;
Detecting the road image by using the target neural network trained by the neural network training method according to any one of the first aspect to obtain target detection data of a target vehicle included in the road image;
the running apparatus is controlled based on target detection data of a target vehicle included in the road image.
In a fourth aspect, the present disclosure provides a neural network training device, comprising:
The system comprises a first acquisition module, a second acquisition module and a first detection module, wherein the first acquisition module is used for acquiring a sample image and two-dimensional annotation data of the sample image, and the two-dimensional annotation data comprises detection frame information of a target vehicle in the sample image and at least one attribute information capable of representing the three-dimensional pose of the target vehicle;
the training module is used for training a target neural network comprising a plurality of branch networks based on the sample image and the two-dimensional annotation data;
Wherein, after the sample image is input to the target neural network, each branch network outputs one of two-dimensional detection frame information of the target vehicle and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle, respectively.
In a fifth aspect, the present disclosure provides an image detection apparatus including:
The second acquisition module is used for acquiring an image to be detected;
The first generation module is used for inputting the image to be detected into a trained target neural network which comprises a plurality of branch networks, obtaining two-dimensional detection frame information of a vehicle in the image to be detected, which is output by the plurality of branch networks in parallel, and at least one attribute information capable of representing the three-dimensional pose of the vehicle;
and the determining module is used for determining the detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.
In a sixth aspect, the present disclosure provides a travel control apparatus including:
the third acquisition module is used for acquiring road images acquired by the driving device in the driving process;
The second generating module is configured to detect the road image by using the target neural network trained by the neural network training method according to any one of the first aspects, so as to obtain target detection data of a target vehicle included in the road image;
and a control module for controlling the running apparatus based on target detection data of a target vehicle included in the road image.
In a seventh aspect, the present disclosure provides an electronic device comprising a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory in communication over the bus when the electronic device is in operation, the machine-readable instructions when executed by the processor performing the steps of the neural network training method of the first aspect or any of the embodiments described above, or performing the steps of the image detection method of the second aspect or any of the embodiments described above, or performing the steps of the travel control method of the third aspect described above, when executed.
In an eighth aspect, the present disclosure provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor performs the steps of the neural network training method according to the first aspect or any of the embodiments, or performs the steps of the image detection method according to the second aspect or any of the embodiments, or performs the steps of the travel control method according to the third aspect.
The foregoing objects, features and advantages of the disclosure will be more readily apparent from the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for the embodiments are briefly described below. These drawings are incorporated in and constitute a part of the specification; they show embodiments consistent with the present disclosure and, together with the description, serve to illustrate its technical solutions. It is to be understood that the following drawings illustrate only certain embodiments of the present disclosure and are therefore not to be considered limiting of its scope; a person of ordinary skill in the art may obtain other relevant drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a neural network training method according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a sample image in a neural network training method according to an embodiment of the disclosure;
fig. 3 is a schematic diagram of a target vehicle in a neural network training method according to an embodiment of the disclosure;
fig. 4 is a schematic diagram of a target neural network in a neural network training method according to an embodiment of the disclosure;
fig. 5 shows a flowchart of an image detection method according to an embodiment of the disclosure;
Fig. 6 is a schematic diagram of a two-dimensional compacting frame in an image detection method according to an embodiment of the disclosure;
Fig. 7 is a schematic flow chart of a driving control method according to an embodiment of the disclosure;
FIG. 8 illustrates a schematic architecture of a neural network training device provided by embodiments of the present disclosure;
fig. 9 shows a schematic architecture diagram of an image detection apparatus according to an embodiment of the disclosure;
fig. 10 shows a schematic structural diagram of a travel control device according to an embodiment of the present disclosure;
fig. 11 shows a schematic structural diagram of an electronic device according to an embodiment of the disclosure;
FIG. 12 illustrates a schematic diagram of another electronic device provided by an embodiment of the present disclosure;
Fig. 13 shows a schematic structural diagram of another electronic device provided in an embodiment of the disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, not all embodiments. The components of the embodiments of the present disclosure, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be made by those skilled in the art based on the embodiments of this disclosure without making any inventive effort, are intended to be within the scope of this disclosure.
Generally, a camera may be disposed on a vehicle, an image is collected by the camera, a running vehicle is detected according to the collected image, a two-dimensional detection result of the running vehicle is obtained, and the obtained two-dimensional detection result of the running vehicle is input into a downstream module such as a tracking module and a ranging module to obtain three-dimensional information of the running vehicle, however, because the collected image lacks space information, the accuracy of the three-dimensional information of the running vehicle obtained based on the two-dimensional detection result is lower. In order to alleviate the above-mentioned problems, embodiments of the present disclosure provide a neural network training method.
The defects of the above scheme are findings obtained by the inventors through practice and careful study. Therefore, the discovery process of the above problems, and the solutions to them set forth hereinafter, should all be regarded as the inventors' contributions to the present disclosure.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
For the convenience of understanding the embodiments of the present disclosure, the neural network training method, image detection method, and driving control method disclosed in the embodiments of the present disclosure are first described in detail. The execution subject of these methods is generally a computer device with a certain computing capability, for example a terminal device, a server, or another processing device; the terminal device may be a user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and so on. In some possible implementations, these methods may be implemented by a processor calling computer-readable instructions stored in a memory.
Referring to fig. 1, a flowchart of a neural network training method according to an embodiment of the disclosure is shown, where the method includes S101-S102, where:
S101, acquiring a sample image and two-dimensional annotation data of the sample image, wherein the two-dimensional annotation data comprises detection frame information of a target vehicle in the sample image and at least one attribute information capable of representing the three-dimensional pose of the target vehicle.
S102, training a target neural network comprising a plurality of branch networks based on the sample image and the two-dimensional labeling data.
Wherein, after the sample image is input to the target neural network, each branch network outputs one of two-dimensional detection frame information of the target vehicle and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle, respectively.
According to the method, a sample image and its two-dimensional annotation data are obtained, where the annotation data comprise detection frame information of the target vehicle and at least one attribute information capable of representing the three-dimensional pose of the target vehicle, and the target neural network is trained with the sample image and the annotation data. Because the two-dimensional annotation data include the at least one attribute information, the trained target neural network can predict that attribute information, which enriches the data types available for determining the three-dimensional pose of the target vehicle. The three-dimensional pose information of the target vehicle can therefore be determined more accurately from the at least one attribute information and the detection frame information predicted by the target neural network.
Meanwhile, the target neural network comprises a plurality of branch networks. After the sample image is input into the target neural network, each branch network outputs one of the two-dimensional detection frame information of the target vehicle and the at least one attribute information capable of representing its three-dimensional pose; because these outputs are produced in parallel, the information detection efficiency is improved.
S101-S102 are specifically described below.
For S101:
the sample image may be any acquired image containing the target vehicle. The image may be an image containing the target vehicle in any scene, for example, an image of the target vehicle traveling on a road, an image of the target vehicle parked on a parking space, or the like may be included in the sample image. The target vehicle may be any motor vehicle, for example, a truck, a car, a minibus, or the like.
The two-dimensional annotation data comprise detection frame information of the target vehicle in the sample image. The detection frame information may include position information of the detection frame, size information of the detection frame, a category of the detection frame, and the like. For example, the categories may include small vehicles, medium vehicles, and large vehicles; or cars, minibuses, off-road vehicles, and the like; or simply a first category of motor vehicles and a second category of non-motor vehicles.
The detection frame information may be position information of four vertexes of the detection frame, a category of the detection frame, or the detection frame information may be position information of a center point of the detection frame, size information of the detection frame, a category of the detection frame, or the like.
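The two detection-frame encodings described above (corner positions vs. center point plus size) are interchangeable; a small pair of conversion helpers makes this concrete. The function names are illustrative, not from the patent.

```python
def corners_to_center(bbox):
    """(x1, y1, x2, y2) -> (cx, cy, w, h): corner encoding to center/size encoding."""
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)

def center_to_corners(box):
    """(cx, cy, w, h) -> (x1, y1, x2, y2): center/size encoding back to corners."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

b = (100, 50, 300, 200)
print(corners_to_center(b))  # (200.0, 125.0, 200, 150)
```

Since the two forms carry the same information, either may be annotated and regressed; the detection frame category is stored alongside whichever encoding is used.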
The at least one attribute information capable of characterizing the three-dimensional pose of the target vehicle may comprise at least one of: first position information of any demarcation point on a demarcation line between adjacent visible faces of the target vehicle; second position information of a contact point between at least one visible wheel of the target vehicle and the ground; and orientation information of the target vehicle, wherein the orientation information comprises first orientation information and/or second orientation information, and the second orientation indicated by the second orientation information is covered by the first orientation indicated by the first orientation information. Setting a plurality of attribute information here enriches the content of the attribute information, so that the three-dimensional pose information of the target vehicle can be accurately determined according to the at least one attribute information.
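To make the structure of the two-dimensional annotation data concrete, a minimal container is sketched below; all field names and the coordinate convention (detection frame as (x1, y1, x2, y2)) are assumptions for illustration, not identifiers from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative container for the two-dimensional annotation data described
# above. Field names are hypothetical, not the patent's own identifiers.
@dataclass
class VehicleAnnotation:
    box: Tuple[float, float, float, float]  # detection frame (x1, y1, x2, y2)
    category: str                           # category of the detection frame
    # first position information: demarcation point on a demarcation line
    demarcation_point: Optional[Tuple[float, float]] = None
    # second position information: wheel-ground contact points
    contact_points: Tuple[Tuple[float, float], ...] = ()
    coarse_orientation: Optional[int] = None  # first orientation info (3 classes)
    fine_orientation: Optional[int] = None    # second orientation info (8 classes)

ann = VehicleAnnotation(box=(100.0, 50.0, 300.0, 200.0), category="car",
                        demarcation_point=(220.0, 200.0),
                        contact_points=((150.0, 200.0), (100.0, 180.0)),
                        coarse_orientation=3, fine_orientation=7)
```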
Any demarcation point of the target vehicle may be any point on a demarcation line between two adjacent visible surfaces of the target vehicle on the sample image, for example, the demarcation line may be a line perpendicular to the ground where a lamp on the target vehicle is located, and the demarcation point may be an intersection point of the demarcation line and the detection frame. The visible wheel may be the wheel on the side of the visible face of the target vehicle contained in the sample image, and preferably the number of contact points may be two, i.e. the contact points of the two visible wheels with the ground.
Referring to fig. 2, a schematic diagram of a sample image includes a demarcation point 21, a contact point between two wheels and the ground, i.e., a first contact point 22, and a second contact point 23.
In an alternative embodiment, in the case where the attribute information includes the first location information of the demarcation point, in S101, acquiring the two-dimensional labeling data of the sample image may include:
s1011, acquiring the coordinate information of the demarcation point in the horizontal direction of an image coordinate system corresponding to the sample image;
S1012, determining the coordinate information of the target vehicle in the vertical direction indicated by the detection frame information as the coordinate information of the demarcation point in the vertical direction of the image coordinate system corresponding to the sample image.
Here, the first position information of the demarcation point includes abscissa information and ordinate information, that is, the abscissa information is coordinate information in the horizontal direction in the image coordinate system corresponding to the sample image, and the ordinate information is coordinate information in the vertical direction in the image coordinate system corresponding to the sample image.
Taking the demarcation point included in fig. 2 as an example, the detection frame information of the target vehicle included in the sample image of fig. 2 may include the position information of the four vertices, i.e., (x1, y1), (x1, y2), (x2, y1), (x2, y2). Further, the coordinate information xA of the demarcation point 21 in the horizontal direction of the image coordinate system corresponding to the sample image may be acquired, and the coordinate information in the vertical direction indicated by the detection frame information of the target vehicle may be determined as the coordinate information y2 of the demarcation point in the vertical direction of the image coordinate system corresponding to the sample image, so that the first position information of the demarcation point is obtained as (xA, y2).
By adopting the above method, the coordinate information of the demarcation point in the vertical direction of the image coordinate system corresponding to the sample image can be determined from the detection frame, while the coordinate information of the demarcation point in the horizontal direction is acquired. Further, when the target neural network regresses the first position information of the demarcation point, only the coordinate information of the demarcation point in the horizontal direction needs to be determined, and the coordinate information in the vertical direction does not need to be regressed. This reduces the number of regression data types, and avoids the reduction in accuracy of other regression data (such as the detection frame information) that can occur when too many data types are regressed.
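The labeling rule above can be sketched in a few lines; it assumes the detection frame is given as (x1, y1, x2, y2) in image coordinates with y growing downward, so the annotated vertical coordinate is the bottom edge y2.

```python
def demarcation_label(box, x_demarcation):
    """First position information of a demarcation point.

    Only the horizontal coordinate x_demarcation is annotated by hand; the
    vertical coordinate is taken from the bottom edge of the detection
    frame. A sketch of the rule described above, under the assumed
    convention box = (x1, y1, x2, y2), y growing downward.
    """
    x1, y1, x2, y2 = box
    return (x_demarcation, y2)

# Example with the fig. 2 style frame: the demarcation point inherits y2.
label = demarcation_label((100.0, 50.0, 300.0, 200.0), 220.0)
```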
In an alternative embodiment, in the case where the attribute information includes the second position information of the contact point, in S101, acquiring the two-dimensional labeling data of the sample image may include:
in a first mode, acquiring the coordinate information of the contact point in the horizontal direction of the image coordinate system corresponding to the sample image, and determining the coordinate information of the target vehicle in the vertical direction indicated by the detection frame information as the coordinate information of the contact point in the vertical direction of the image coordinate system corresponding to the sample image;
In a second mode, acquiring the coordinate information of the contact point in the vertical direction of the image coordinate system corresponding to the sample image, and determining the coordinate information of the target vehicle in the horizontal direction indicated by the detection frame information as the coordinate information of the contact point in the horizontal direction of the image coordinate system corresponding to the sample image.
That is, in the first mode, the abscissa information of the contact point is acquired, and the coordinate information in the vertical direction indicated by the detection frame information is determined as the ordinate information of the contact point. In the second mode, the ordinate information of the contact point is acquired, and the coordinate information in the horizontal direction indicated by the detection frame information is determined as the abscissa information of the contact point.
When the number of contact points is two, the second position information of one contact point may be determined in the first alternative, and the second position information of the other contact point may be determined in the second alternative.
Taking the contact points included in fig. 2 as an example, the coordinate information xB1 of the first contact point 22 in the horizontal direction of the image coordinate system corresponding to the sample image may be acquired, and the coordinate information in the vertical direction indicated by the detection frame information of the target vehicle may be determined as the coordinate information y2 of the first contact point in the vertical direction of the image coordinate system corresponding to the sample image. That is, the second position information of the first contact point 22 may be (xB1, y2).
For the second contact point 23, the coordinate information yB2 of the second contact point 23 in the vertical direction of the image coordinate system corresponding to the sample image may be acquired, and the coordinate information in the horizontal direction indicated by the detection frame information of the target vehicle may be determined as the coordinate information x1 of the second contact point 23 in the horizontal direction of the image coordinate system corresponding to the sample image. That is, the second position information of the second contact point 23 may be (x1, yB2).
Whether x1 or x2 is selected may be determined according to the orientation information of the target vehicle in the sample image, or according to the position of the second contact point relative to the first contact point in the sample image. For example, x2 is selected if the second contact point lies to the right of the first contact point, and x1 is selected if the second contact point lies to the left of the first contact point. Alternatively, if the orientation information of the target vehicle is the fourth intermediate category or the seventh intermediate category, x1 is selected, and if the orientation information of the target vehicle is the fifth intermediate category or the eighth intermediate category, x2 is selected.
By adopting the above method, one coordinate of the second position information of the contact point is determined from the detection frame, and only the other coordinate needs to be acquired and regressed. This reduces the number of regression data types, and avoids the reduction in accuracy of other regression data (such as the detection frame information and the detection frame category) that can occur when too many data types are regressed.
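The two labeling modes for the contact points can be sketched together; the conventions (box as (x1, y1, x2, y2), y growing downward, and the left/right-edge choice driven by which vehicle side is visible) are assumptions consistent with the fig. 2 example, not the patent's code.

```python
def contact_point_labels(box, x_first, y_second, left_side_visible):
    """Second position information for two visible wheel contact points.

    Mode one: the first contact point's x is annotated, its y is taken
    from the bottom edge of the detection frame. Mode two: the second
    contact point's y is annotated, its x is taken from an edge of the
    frame chosen from the vehicle orientation (assumed here: left side
    visible -> x2, right side visible -> x1, matching the fig. 2 example).
    """
    x1, y1, x2, y2 = box
    first = (x_first, y2)                     # mode one
    edge_x = x2 if left_side_visible else x1  # edge chosen from orientation
    second = (edge_x, y_second)               # mode two
    return first, second

# Fig. 2 style case: right side of the vehicle visible, so x1 is used.
first, second = contact_point_labels((100.0, 50.0, 300.0, 200.0),
                                     150.0, 180.0, left_side_visible=False)
```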
In an alternative embodiment, where the orientation information includes the first orientation information, the first orientation information includes a first category in which neither the front nor the rear of the vehicle is visible, a second category in which the rear of the vehicle is visible and the front of the vehicle is not visible, and a third category in which the front of the vehicle is visible and the rear of the vehicle is not visible.
In an alternative embodiment, where the orientation information includes the second orientation information, the second orientation information includes: a first intermediate category in which the front and rear of the vehicle are not visible and the left side of the vehicle is visible; a second intermediate category in which the front and rear of the vehicle are not visible and the right side of the vehicle is visible; a third intermediate category in which the rear of the vehicle is visible, and the front and the sides of the vehicle are not visible; a fourth intermediate category in which the rear of the vehicle is visible, the front of the vehicle is not visible and the right side of the vehicle is visible; a fifth intermediate category in which the rear of the vehicle is visible, the front of the vehicle is not visible and the left side of the vehicle is visible; a sixth intermediate category in which the front of the vehicle is visible, and the rear and the sides of the vehicle are not visible; a seventh intermediate category in which the front of the vehicle is visible, the rear of the vehicle is not visible and the right side of the vehicle is visible; and an eighth intermediate category in which the front of the vehicle is visible, the rear of the vehicle is not visible and the left side of the vehicle is visible.
The second orientation indicated by the second orientation information is covered by the first orientation indicated by the first orientation information. That is, the first category of the first orientation information may comprise the first intermediate category and the second intermediate category of the second orientation information; the second category of the first orientation information may comprise the third, fourth and fifth intermediate categories of the second orientation information; and the third category of the first orientation information may comprise the sixth, seventh and eighth intermediate categories of the second orientation information.
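The grouping above is a fixed many-to-one mapping from the eight fine (second orientation) categories to the three coarse (first orientation) categories; a small lookup table makes it explicit. Numbering the categories from 1 is an assumption for illustration.

```python
# Mapping from the eight intermediate (second orientation) categories to
# the three coarse (first orientation) categories, as grouped above.
FINE_TO_COARSE = {
    1: 1, 2: 1,        # neither front nor rear visible
    3: 2, 4: 2, 5: 2,  # rear visible, front not visible
    6: 3, 7: 3, 8: 3,  # front visible, rear not visible
}

def coarse_orientation(fine_category: int) -> int:
    """Recover the first orientation category from a second orientation category."""
    return FINE_TO_COARSE[fine_category]
```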
It can be seen that the first orientation information of the target vehicle included in fig. 2 is of the third category, and the second orientation information is of the seventh intermediate category.
Referring to fig. 3, a schematic diagram of a target vehicle is shown. For example, in fig. 3, the first orientation information corresponding to the target vehicle 31 may be a first type, the second orientation information may be a first intermediate type, the first orientation information corresponding to the target vehicle 32 may be a third type, the second orientation information may be an eighth intermediate type, the first orientation information corresponding to the target vehicle 33 may be a second type, the second orientation information may be a third intermediate type, the first orientation information corresponding to the target vehicle 34 may be a second type, and the second orientation information may be a fourth intermediate type.
For S102:
The sample image containing the two-dimensional labeling data can be input into a target neural network to be trained, wherein the target neural network comprises a plurality of score networks, and the target neural network is trained for a plurality of times until the accuracy of the trained target neural network is greater than a set accuracy threshold value or until the loss value of the trained target neural network is less than the set loss threshold value, so that the trained target neural network is obtained.
After the sample image is input into the target neural network, each branch network outputs one of the two-dimensional detection frame information of the target vehicle and the at least one attribute information capable of characterizing the three-dimensional pose of the target vehicle. That is, for any image to be detected, the plurality of branch networks in the trained target neural network are used for outputting, in parallel, the two-dimensional detection frame information of the vehicle and the at least one attribute information capable of characterizing the three-dimensional pose of the vehicle.
Referring to fig. 4, a schematic diagram of a target neural network is shown. The sample image 41, the backbone network 42, and the plurality of branch networks 43 are included in fig. 4, and the plurality of branch networks may include a branch network corresponding to the detection frame information, a branch network corresponding to the category of the target vehicle, a branch network corresponding to the first position information of the demarcation point, a branch network corresponding to the second position information of the first contact point, a branch network corresponding to the second position information of the second contact point, and a branch network corresponding to the orientation information.
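The backbone-plus-branches structure of fig. 4 can be sketched with a shared feature vector feeding several independent heads. The head names, feature dimension, and linear heads are all simplifying assumptions for illustration; a real implementation would use convolutional heads in a deep-learning framework.

```python
import numpy as np

# Minimal sketch of a backbone + parallel branch heads as in fig. 4:
# one shared feature vector feeds independent heads, one per output.
rng = np.random.default_rng(0)
FEAT_DIM = 64  # assumed backbone feature dimension

def linear_head(in_dim, out_dim):
    return rng.standard_normal((in_dim, out_dim)) * 0.01

heads = {
    "box": linear_head(FEAT_DIM, 4),          # detection frame (x1, y1, x2, y2)
    "category": linear_head(FEAT_DIM, 3),     # vehicle category logits
    "demarcation_x": linear_head(FEAT_DIM, 1),  # demarcation point abscissa
    "contact_1_x": linear_head(FEAT_DIM, 1),    # first contact point abscissa
    "contact_2_y": linear_head(FEAT_DIM, 1),    # second contact point ordinate
    "orientation": linear_head(FEAT_DIM, 8),    # fine orientation logits
}

def forward(features):
    # Each branch consumes the same backbone features and is computed
    # independently of the others, i.e. the outputs are produced in parallel.
    return {name: features @ w for name, w in heads.items()}

outputs = forward(rng.standard_normal(FEAT_DIM))
```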
Referring to fig. 5, a flowchart of an image detection method according to an embodiment of the disclosure is shown, where the method includes S501-S503, where:
s501, acquiring an image to be detected;
S502, inputting an image to be detected into a trained target neural network comprising a plurality of branch networks, and obtaining two-dimensional detection frame information of a vehicle in the image to be detected, which is output by the plurality of branch networks in parallel, and at least one attribute information capable of representing the three-dimensional pose of the vehicle;
S503, determining a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and at least one attribute information.
In the method, the target neural network is obtained based on the neural network training method in the first aspect, so that the trained target neural network can more accurately output the two-dimensional detection frame information and at least one attribute information of the vehicle, and further, the detection result of the image to be detected can be more accurately determined based on the two-dimensional detection frame information and the at least one attribute information of the vehicle.
For S501 and S502:
the image to be detected can be any image. The acquired image to be detected is input into the trained target neural network comprising a plurality of branch networks, obtaining the two-dimensional detection frame information of the vehicle included in the image to be detected and the at least one attribute information capable of characterizing the three-dimensional pose of the vehicle, output in parallel by the plurality of branch networks. For example, the two-dimensional detection frame information, the first position information of the demarcation point, the second position information of the contact points, the orientation information and the like corresponding to the vehicle included in the image to be detected can be obtained.
For S503:
in an alternative embodiment, determining a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and at least one attribute information includes:
depth information of the vehicle is determined based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.
The depth information of the vehicle may be the distance between the center of the vehicle and the image acquisition device that acquired the image to be detected. For example, a bird's-eye view corresponding to the vehicle may be determined from the two-dimensional detection frame information and the at least one attribute information of the vehicle through a coordinate transformation, and the depth information of the vehicle may then be determined from the bird's-eye view; or a trained neural network for determining depth information may be used to determine the depth information of the vehicle from the two-dimensional detection frame information and the at least one attribute information of the vehicle.
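One concrete way such a coordinate transformation can recover depth is flat-ground pinhole geometry applied to a wheel-ground contact point: a ground point imaged at row y lies at depth fy * h / (y - cy) for a forward-looking camera of height h. This formula is an illustrative assumption, not something the patent prescribes.

```python
def ground_plane_depth(y_contact, fy, cy, camera_height):
    """Depth of a wheel-ground contact point under a flat-ground assumption.

    y_contact: image row of the contact point (y grows downward);
    fy, cy: vertical focal length and principal point row of the camera;
    camera_height: camera height above the ground in meters.
    Returns depth z = fy * camera_height / (y_contact - cy).
    """
    if y_contact <= cy:
        raise ValueError("contact point must lie below the principal point")
    return fy * camera_height / (y_contact - cy)
```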
In an alternative embodiment, determining a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and at least one attribute information includes:
and determining three-dimensional detection data of the vehicle based on the image to be detected, the two-dimensional detection frame information of the vehicle included in the image to be detected, and at least one attribute information corresponding to the vehicle.
For example, the image to be detected, two-dimensional detection frame information of the vehicle included in the image to be detected, and at least one attribute information corresponding to the vehicle may be input into a trained three-dimensional detection neural network, and three-dimensional detection data of the vehicle may be determined. The three-dimensional detection data of the vehicle may include position information of a three-dimensional detection frame of the vehicle, size information of the three-dimensional detection frame, a category of the three-dimensional detection frame, and the like.
In an alternative embodiment, in the case that the at least one attribute information includes first position information of any demarcation point on a demarcation line between adjacent visible faces of the vehicle and second position information of a contact point between at least one visible wheel of the vehicle and the ground, determining three-dimensional detection data of the vehicle based on the image to be detected, two-dimensional detection frame information of the vehicle included in the image to be detected, and at least one attribute information corresponding to the vehicle, includes:
determining two-dimensional compact frame information representing a single plane of a vehicle based on first position information of a demarcation point, second position information of a contact point and two-dimensional detection frame information of the vehicle;
And step two, determining three-dimensional detection data of the vehicle based on the two-dimensional compact frame information and the image to be detected.
When the at least one attribute information includes the demarcation point, the two-dimensional detection frame of the vehicle may be divided into two detection frames using the first position information of the demarcation point corresponding to the vehicle, and the detection frame containing the contact points may be determined as the two-dimensional compact frame information characterizing a single plane of the vehicle. Referring to fig. 6, a schematic diagram of a two-dimensional compact frame is shown; the detection frame on the right side of the demarcation line is the determined two-dimensional compact frame, where 21 in fig. 6 is the demarcation point, and 22 and 23 are two different contact points.
And then the two-dimensional compact frame information and the image to be detected can be input into a trained three-dimensional detection neural network to determine three-dimensional detection data of the vehicle.
By adopting the method, because the space information contained in the two-dimensional compact frame information is more accurate, the three-dimensional detection data of the vehicle can be more accurately determined based on the two-dimensional compact frame information and the image to be detected.
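The split-and-select rule for the compact frame can be sketched directly: cut the detection frame at the demarcation line and keep the half containing (most of) the contact points. The (x1, y1, x2, y2) convention and the majority rule are assumptions for illustration.

```python
def compact_frame(box, x_demarcation, contact_points):
    """Two-dimensional compact frame characterizing a single plane of a vehicle.

    Splits the detection frame box = (x1, y1, x2, y2) vertically at the
    demarcation line x = x_demarcation and returns the half containing the
    majority of the wheel-ground contact points.
    """
    x1, y1, x2, y2 = box
    left = (x1, y1, x_demarcation, y2)
    right = (x_demarcation, y1, x2, y2)
    # Count contact points falling on the left of the demarcation line.
    in_left = sum(1 for (px, _) in contact_points if px <= x_demarcation)
    return left if in_left > len(contact_points) - in_left else right

# Fig. 6 style case: both contact points right of the line -> right half kept.
frame = compact_frame((100.0, 50.0, 300.0, 200.0), 220.0,
                      [(250.0, 200.0), (280.0, 180.0)])
```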
Referring to fig. 7, a flow chart of a driving control method according to an embodiment of the disclosure is shown, where the method includes S701-S703, where:
S701, acquiring a road image acquired by a running device in the running process;
S702, detecting a road image by using the target neural network trained by the neural network training method described in the embodiment to obtain target detection data of a target vehicle included in the road image;
S703, controlling the traveling apparatus based on the target detection data of the target vehicle included in the road image.
By way of example, the running device may be an autonomous vehicle, a vehicle equipped with an advanced driving assistance system (Advanced Driving Assistance System, ADAS), a robot, or the like. The road image may be an image acquired by the running device in real time during running.
When the running device is controlled, the running device can be controlled to accelerate, decelerate, turn, brake and the like, or voice prompt information can be played to prompt a driver to control the running device to accelerate, decelerate, turn, brake and the like.
In a specific implementation, the road image can be input into the trained target neural network, and the road image is detected to obtain the target detection data of the target vehicle included in the road image; the running device can then be controlled based on the target detection data of the target vehicle. Controlling the running device based on the target detection data of the target vehicle may include inputting the target detection data of the target vehicle into a trained three-dimensional detection neural network to obtain three-dimensional detection information of the target vehicle, and controlling the running device based on the detected three-dimensional detection information of the target vehicle.
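As a toy illustration of how detection results can drive a control decision, the rule below maps the depth of a detected lead vehicle to an action such as decelerating or braking; the thresholds and action names are made up for this sketch and carry no relation to any real control stack.

```python
def plan_action(lead_vehicle_depth, safe_distance=20.0, warn_distance=40.0):
    """Toy decision rule: detected lead-vehicle depth (meters) -> action.

    Hypothetical thresholds; a real running device would use a full
    planning and control pipeline, or play a voice prompt to the driver.
    """
    if lead_vehicle_depth < safe_distance:
        return "brake"
    if lead_vehicle_depth < warn_distance:
        return "decelerate"
    return "keep_speed"
```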
It will be appreciated by those skilled in the art that in the above-described method of the specific embodiments, the written order of steps is not meant to imply a strict order of execution but rather should be construed according to the function and possibly inherent logic of the steps.
Based on the same concept, the embodiment of the present disclosure further provides a neural network training device, which is shown in fig. 8, and is a schematic architecture diagram of the neural network training device provided by the embodiment of the present disclosure, including a first obtaining module 801, a training module 802, and specifically:
The first acquisition module 801 is configured to acquire a sample image and two-dimensional annotation data of the sample image, where the two-dimensional annotation data includes detection frame information of a target vehicle in the sample image and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle;
A training module 802 for training a target neural network including a plurality of branch networks based on the sample image and the two-dimensional annotation data;
Wherein, after the sample image is input to the target neural network, each branch network outputs one of two-dimensional detection frame information of the target vehicle and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle, respectively.
In a possible implementation manner, the attribute information includes at least one of the following:
First position information of any demarcation point on a demarcation line between adjacent visible faces of the target vehicle, second position information of a contact point between at least one visible wheel of the target vehicle and the ground, orientation information of the target vehicle;
the orientation information comprises first orientation information and/or second orientation information, and the second orientation indicated by the second orientation information is covered in the first orientation indicated by the first orientation information.
In a possible embodiment, where the orientation information includes the first orientation information, the first orientation information includes a first category in which neither the front nor the rear of the vehicle is visible, a second category in which the rear of the vehicle is visible and the front of the vehicle is not visible, and a third category in which the front of the vehicle is visible and the rear of the vehicle is not visible.
In a possible embodiment, where the orientation information includes the second orientation information, the second orientation information includes: a first intermediate category in which the front and rear of the vehicle are not visible and the left side of the vehicle is visible; a second intermediate category in which the front and rear of the vehicle are not visible and the right side of the vehicle is visible; a third intermediate category in which the rear of the vehicle is visible, and the front and the sides of the vehicle are not visible; a fourth intermediate category in which the rear of the vehicle is visible, the front of the vehicle is not visible and the right side of the vehicle is visible; a fifth intermediate category in which the rear of the vehicle is visible, the front of the vehicle is not visible and the left side of the vehicle is visible; a sixth intermediate category in which the front of the vehicle is visible, and the rear and the sides of the vehicle are not visible; a seventh intermediate category in which the front of the vehicle is visible, the rear of the vehicle is not visible and the right side of the vehicle is visible; and an eighth intermediate category in which the front of the vehicle is visible, the rear of the vehicle is not visible and the left side of the vehicle is visible.
In a possible implementation manner, in a case where the attribute information includes the first position information of the demarcation point, the first obtaining module 801 is configured to, when obtaining the two-dimensional labeling data of the sample image:
acquiring the coordinate information of the demarcation point in the horizontal direction of an image coordinate system corresponding to the sample image;
and determining the coordinate information of the target vehicle in the vertical direction indicated by the detection frame information as the coordinate information of the demarcation point in the vertical direction of the image coordinate system corresponding to the sample image.
In a possible implementation manner, in a case where the attribute information includes the second position information of the contact point, the first obtaining module 801 is configured to, when obtaining the two-dimensional labeling data of the sample image:
The method comprises the steps of obtaining the coordinate information of the contact point in the horizontal direction of an image coordinate system corresponding to the sample image, determining the coordinate information of the target vehicle in the vertical direction indicated by the detection frame information as the coordinate information of the contact point in the vertical direction of the image coordinate system corresponding to the sample image, and/or,
And determining the coordinate information of the contact point in the horizontal direction of the image coordinate system corresponding to the sample image as the coordinate information of the contact point in the horizontal direction of the image coordinate system corresponding to the sample image.
Based on the same concept, the embodiment of the present disclosure further provides an image detection apparatus, referring to fig. 9, which is a schematic structural diagram of the image detection apparatus provided by the embodiment of the present disclosure, including a second obtaining module 901, a first generating module 902, and a determining module 903, specifically:
a second acquiring module 901, configured to acquire an image to be detected;
A first generating module 902, configured to input the image to be detected to a trained target neural network that includes a plurality of branch networks, obtain two-dimensional detection frame information of a vehicle in the image to be detected that is output by the plurality of branch networks in parallel, and at least one attribute information that can represent a three-dimensional pose of the vehicle;
A determining module 903, configured to determine a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.
In a possible implementation manner, the determining module 903 is configured to, when determining a detection result of the image to be detected based on two-dimensional detection frame information of the vehicle and the at least one attribute information,:
Depth information of the vehicle is determined based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.
In a possible implementation manner, the determining module 903 is configured to, when determining a detection result of the image to be detected based on two-dimensional detection frame information of the vehicle and the at least one attribute information,:
And determining three-dimensional detection data of the vehicle based on the image to be detected, two-dimensional detection frame information of the vehicle included in the image to be detected and at least one attribute information corresponding to the vehicle.
In a possible implementation manner, in a case where the at least one attribute information includes first position information of any demarcation point on a demarcation line between adjacent visible faces of the vehicle and second position information of a contact point between at least one visible wheel of the vehicle and the ground, the determining module 903 is configured, when determining three-dimensional detection data of the vehicle based on the image to be detected, two-dimensional detection frame information of the vehicle included in the image to be detected, and at least one attribute information corresponding to the vehicle, to:
determining two-dimensional compact frame information characterizing a single plane of the vehicle based on the first location information of the demarcation point, the second location information of the contact point, and the two-dimensional detection frame information of the vehicle;
and determining three-dimensional detection data of the vehicle based on the two-dimensional compact frame information and the image to be detected.
Based on the same concept, the embodiment of the present disclosure further provides a running control apparatus, referring to fig. 10, which is a schematic structural diagram of the running control apparatus provided by the embodiment of the present disclosure, including a third obtaining module 1001, a second generating module 1002, and a control module 1003, specifically:
a third acquiring module 1001, configured to acquire a road image acquired by a driving device during driving;
a second generating module 1002, configured to detect the road image by using the target neural network trained by the neural network training method described in the foregoing embodiments, to obtain target detection data of a target vehicle included in the road image; and
a control module 1003, configured to control the driving device based on the target detection data of the target vehicle included in the road image.
In some embodiments, the functions or modules included in the apparatus provided by the embodiments of the present disclosure may be used to perform the methods described in the foregoing method embodiments; for specific implementations, reference may be made to the descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
Based on the same technical concept, an embodiment of the present disclosure further provides an electronic device. Referring to fig. 11, which is a schematic structural diagram of an electronic device according to an embodiment of the disclosure, the electronic device 1100 includes a processor 1101, a memory 1102, and a bus 1103. The memory 1102 is configured to store execution instructions and includes an internal memory 11021 and an external memory 11022. The internal memory 11021, also called main memory, temporarily stores operation data of the processor 1101 and data exchanged with the external memory 11022, such as a hard disk; the processor 1101 exchanges data with the external memory 11022 through the internal memory 11021. When the electronic device 1100 runs, the processor 1101 and the memory 1102 communicate with each other through the bus 1103, so that the processor 1101 executes the following instructions:
acquiring a sample image and two-dimensional annotation data of the sample image, wherein the two-dimensional annotation data includes detection frame information of a target vehicle in the sample image and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle;
training, based on the sample image and the two-dimensional annotation data, a target neural network comprising a plurality of branch networks;
wherein, after the sample image is input into the target neural network, each branch network outputs one of the following: the two-dimensional detection frame information of the target vehicle, or the at least one attribute information capable of characterizing the three-dimensional pose of the target vehicle.
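The multi-branch output structure can be sketched as a shared backbone feature feeding several independent heads, one per output (2D box, orientation category, demarcation point, wheel contact point). The toy NumPy sketch below is hedged: the feature size, head shapes, and random weights are assumptions for illustration, not the network of the disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_head(feat, w, b):
    """One branch: a small linear head over the shared feature."""
    return feat @ w + b

feat = rng.normal(size=(1, 64))  # shared backbone feature (assumed size)

# one head per output; each branch predicts one kind of annotation
heads = {
    "box2d":         (rng.normal(size=(64, 4)), np.zeros(4)),  # x1, y1, x2, y2
    "orientation":   (rng.normal(size=(64, 3)), np.zeros(3)),  # 3 coarse classes
    "demarcation":   (rng.normal(size=(64, 1)), np.zeros(1)),  # x of demarcation point
    "wheel_contact": (rng.normal(size=(64, 2)), np.zeros(2)),  # (x, y) of contact point
}

# all branches run over the same shared feature, producing the parallel outputs
outputs = {name: linear_head(feat, w, b) for name, (w, b) in heads.items()}
```

Because every branch consumes the same feature map, a single forward pass yields the 2D box and all pose-related attributes in parallel, which is the property the claims rely on.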
Based on the same technical concept, an embodiment of the present disclosure further provides an electronic device. Referring to fig. 12, which is a schematic structural diagram of an electronic device according to an embodiment of the disclosure, the electronic device 1200 includes a processor 1201, a memory 1202, and a bus 1203. The memory 1202 is configured to store execution instructions and includes an internal memory 12021 and an external memory 12022. The internal memory 12021, also called main memory, temporarily stores operation data of the processor 1201 and data exchanged with the external memory 12022, such as a hard disk; the processor 1201 exchanges data with the external memory 12022 through the internal memory 12021. When the electronic device 1200 runs, the processor 1201 and the memory 1202 communicate with each other through the bus 1203, so that the processor 1201 executes the following instructions:
acquiring an image to be detected;
inputting the image to be detected into a trained target neural network comprising a plurality of branch networks, to obtain two-dimensional detection frame information of a vehicle in the image to be detected output in parallel by the plurality of branch networks, and at least one attribute information capable of characterizing the three-dimensional pose of the vehicle; and
determining a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.
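One common way such 2D cues can support 3D reasoning is a flat-ground pinhole model: a wheel/ground contact point observed at image row v maps to depth Z = fy * h / (v - cy), where fy is the focal length in pixels, cy the principal-point row, and h the camera height. This is offered only as a hedged illustration, since the disclosure does not specify the geometry used; all numeric parameters below are made up.

```python
def depth_from_ground_contact(v, fy, cy, cam_height):
    """Depth of a ground-contact point under a flat-ground,
    forward-facing pinhole camera: Z = fy * cam_height / (v - cy).

    v: image row (pixels) of the wheel/ground contact point.
    fy: focal length in pixels; cy: principal-point row.
    cam_height: camera height above the ground plane (metres).
    """
    dv = v - cy
    if dv <= 0:
        # points at or above the horizon have no finite ground depth
        raise ValueError("contact point must lie below the principal point")
    return fy * cam_height / dv

# made-up intrinsics: fy = 1000 px, cy = 500 px, camera 1.5 m above ground
z = depth_from_ground_contact(v=600.0, fy=1000.0, cy=500.0, cam_height=1.5)
# z == 15.0 metres for these assumed parameters
```

This is one reason the contact-point attribute is useful: its image row alone, combined with known camera intrinsics and mounting height, constrains the vehicle's distance.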
Based on the same technical concept, an embodiment of the present disclosure further provides an electronic device. Referring to fig. 13, which is a schematic structural diagram of an electronic device according to an embodiment of the disclosure, the electronic device 1300 includes a processor 1301, a memory 1302, and a bus 1303. The memory 1302 is configured to store execution instructions and includes an internal memory 13021 and an external memory 13022. The internal memory 13021, also called main memory, temporarily stores operation data of the processor 1301 and data exchanged with the external memory 13022, such as a hard disk; the processor 1301 exchanges data with the external memory 13022 through the internal memory 13021. When the electronic device 1300 runs, the processor 1301 and the memory 1302 communicate with each other through the bus 1303, so that the processor 1301 executes the following instructions:
acquiring a road image collected by a driving device during driving;
detecting the road image by using the target neural network trained by the neural network training method described in the foregoing embodiments, to obtain target detection data of a target vehicle included in the road image; and
controlling the driving device based on the target detection data of the target vehicle included in the road image.
In addition, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the neural network training method, the image detection method, and the driving control method described in the foregoing method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product carrying program code; the instructions included in the program code may be used to execute the steps of the neural network training method, the image detection method, and the driving control method described in the foregoing method embodiments. For details, reference may be made to the foregoing method embodiments, which are not repeated herein.
The above computer program product may be implemented in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium; in another alternative embodiment, it is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
It will be clear to those skilled in the art that, for convenience and brevity of description, for the specific working procedures of the system and apparatus described above, reference may be made to the corresponding procedures in the foregoing method embodiments, which are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present disclosure. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or various other media capable of storing program code.
The foregoing is merely a specific embodiment of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or substitution that can be readily conceived by a person skilled in the art within the technical scope of the present disclosure shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (16)

1. A neural network training method, comprising:
acquiring a sample image and two-dimensional annotation data of the sample image, wherein the two-dimensional annotation data includes detection frame information of a target vehicle in the sample image and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle; and
training, based on the sample image and the two-dimensional annotation data, a target neural network comprising a plurality of branch networks;
wherein, after the sample image is input into the target neural network, each branch network outputs one of the following: two-dimensional detection frame information of the target vehicle, and at least one attribute information capable of characterizing the three-dimensional pose of the target vehicle; and
the attribute information includes at least one of: first position information of any demarcation point on a demarcation line between adjacent visible faces of the target vehicle, and second position information of a contact point between at least one visible wheel of the target vehicle and the ground.

2. The method according to claim 1, wherein the attribute information further includes orientation information of the target vehicle;
wherein the orientation information includes first orientation information and/or second orientation information, and a second orientation indicated by the second orientation information is encompassed by a first orientation indicated by the first orientation information.

3. The method according to claim 2, wherein, in a case where the orientation information includes the first orientation information, the first orientation information includes: a first category characterizing that neither the vehicle front nor the vehicle rear is visible, a second category characterizing that the vehicle rear is visible and the vehicle front is not visible, and a third category characterizing that the vehicle front is visible and the vehicle rear is not visible.

4. The method according to claim 2 or 3, wherein, in a case where the orientation information includes the second orientation information, the second orientation information includes: a first intermediate category in which the vehicle front and the vehicle rear are not visible and the vehicle left side is visible; a second intermediate category in which the vehicle front and the vehicle rear are not visible and the vehicle right side is visible; a third intermediate category in which the vehicle rear is visible, the vehicle front is not visible, and the vehicle sides are not visible; a fourth intermediate category in which the vehicle rear is visible, the vehicle front is not visible, and the vehicle right side is visible; a fifth intermediate category in which the vehicle rear is visible, the vehicle front is not visible, and the vehicle left side is visible; a sixth intermediate category in which the vehicle front is visible, the vehicle rear is not visible, and the vehicle sides are not visible; a seventh intermediate category in which the vehicle front is visible, the vehicle rear is not visible, and the vehicle right side is visible; and an eighth intermediate category in which the vehicle front is visible, the vehicle rear is not visible, and the vehicle left side is visible.

5. The method according to claim 2, wherein, in a case where the attribute information includes the first position information of the demarcation point, acquiring the two-dimensional annotation data of the sample image comprises:
acquiring coordinate information of the demarcation point in a horizontal direction of an image coordinate system corresponding to the sample image; and
determining coordinate information in a vertical direction indicated by the detection frame information of the target vehicle as coordinate information of the demarcation point in the vertical direction of the image coordinate system corresponding to the sample image.

6. The method according to claim 2, wherein, in a case where the attribute information includes the second position information of the contact point, acquiring the two-dimensional annotation data of the sample image comprises:
acquiring coordinate information of the contact point in a horizontal direction of an image coordinate system corresponding to the sample image, and determining coordinate information in a vertical direction indicated by the detection frame information of the target vehicle as coordinate information of the contact point in the vertical direction of the image coordinate system corresponding to the sample image; and/or
acquiring coordinate information of the contact point in the vertical direction of the image coordinate system corresponding to the sample image, and determining coordinate information in the horizontal direction indicated by the detection frame information of the target vehicle as coordinate information of the contact point in the horizontal direction of the image coordinate system corresponding to the sample image.

7. An image detection method, comprising:
acquiring an image to be detected;
inputting the image to be detected into a trained target neural network comprising a plurality of branch networks, to obtain two-dimensional detection frame information of a vehicle in the image to be detected output in parallel by the plurality of branch networks, and at least one attribute information capable of characterizing a three-dimensional pose of the vehicle; and
determining a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information;
wherein the attribute information includes at least one of: first position information of any demarcation point on a demarcation line between adjacent visible faces of a target vehicle, and second position information of a contact point between at least one visible wheel of the target vehicle and the ground.

8. The method according to claim 7, wherein determining the detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information comprises:
determining depth information of the vehicle based on the two-dimensional detection frame information of the vehicle and the at least one attribute information.

9. The method according to claim 7, wherein determining the detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information comprises:
determining three-dimensional detection data of the vehicle based on the image to be detected, the two-dimensional detection frame information of the vehicle included in the image to be detected, and the at least one attribute information corresponding to the vehicle.

10. The method according to claim 9, wherein, in a case where the at least one attribute information includes first position information of any demarcation point on a demarcation line between adjacent visible faces of the vehicle and second position information of a contact point between at least one visible wheel of the vehicle and the ground, determining the three-dimensional detection data of the vehicle based on the image to be detected, the two-dimensional detection frame information of the vehicle included in the image to be detected, and the at least one attribute information corresponding to the vehicle comprises:
determining, based on the first position information of the demarcation point, the second position information of the contact point, and the two-dimensional detection frame information of the vehicle, two-dimensional compact frame information characterizing a single plane of the vehicle, wherein the two-dimensional compact frame is a detection frame corresponding to a single plane of the vehicle, determined from the two-dimensional detection frame of the vehicle; and
determining the three-dimensional detection data of the vehicle based on the two-dimensional compact frame information and the image to be detected.

11. A driving control method, comprising:
acquiring a road image collected by a driving device during driving;
detecting the road image by using a target neural network trained by the neural network training method according to any one of claims 1 to 6, to obtain target detection data of a target vehicle included in the road image; and
controlling the driving device based on the target detection data of the target vehicle included in the road image.

12. A neural network training apparatus, comprising:
a first acquiring module, configured to acquire a sample image and two-dimensional annotation data of the sample image, wherein the two-dimensional annotation data includes detection frame information of a target vehicle in the sample image and at least one attribute information capable of characterizing a three-dimensional pose of the target vehicle; and
a training module, configured to train, based on the sample image and the two-dimensional annotation data, a target neural network comprising a plurality of branch networks;
wherein, after the sample image is input into the target neural network, each branch network outputs one of the following: two-dimensional detection frame information of the target vehicle, and at least one attribute information capable of characterizing the three-dimensional pose of the target vehicle; and
the attribute information includes at least one of: first position information of any demarcation point on a demarcation line between adjacent visible faces of the target vehicle, and second position information of a contact point between at least one visible wheel of the target vehicle and the ground.

13. An image detection apparatus, comprising:
a second acquiring module, configured to acquire an image to be detected;
a first generating module, configured to input the image to be detected into a trained target neural network comprising a plurality of branch networks, to obtain two-dimensional detection frame information of a vehicle in the image to be detected output in parallel by the plurality of branch networks, and at least one attribute information capable of characterizing a three-dimensional pose of the vehicle; and
a determining module, configured to determine a detection result of the image to be detected based on the two-dimensional detection frame information of the vehicle and the at least one attribute information;
wherein the attribute information includes at least one of: first position information of any demarcation point on a demarcation line between adjacent visible faces of a target vehicle, and second position information of a contact point between at least one visible wheel of the target vehicle and the ground.

14. A driving control apparatus, comprising:
a third acquiring module, configured to acquire a road image collected by a driving device during driving;
a second generating module, configured to detect the road image by using a target neural network trained by the neural network training method according to any one of claims 1 to 6, to obtain target detection data of a target vehicle included in the road image; and
a control module, configured to control the driving device based on the target detection data of the target vehicle included in the road image.

15. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate with each other through the bus; and when executed by the processor, the machine-readable instructions perform the steps of the neural network training method according to any one of claims 1 to 6, or the steps of the image detection method according to any one of claims 7 to 10, or the steps of the driving control method according to claim 11.

16. A computer-readable storage medium on which a computer program is stored, wherein, when executed by a processor, the computer program performs the steps of the neural network training method according to any one of claims 1 to 6, or the steps of the image detection method according to any one of claims 7 to 10, or the steps of the driving control method according to claim 11.
CN202110713234.0A 2021-06-25 2021-06-25 Neural network training, image detection, driving control method and device Active CN113435318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110713234.0A CN113435318B (en) 2021-06-25 2021-06-25 Neural network training, image detection, driving control method and device

Publications (2)

Publication Number Publication Date
CN113435318A CN113435318A (en) 2021-09-24
CN113435318B true CN113435318B (en) 2025-02-25

Family

ID=77755330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110713234.0A Active CN113435318B (en) 2021-06-25 2021-06-25 Neural network training, image detection, driving control method and device

Country Status (1)

Country Link
CN (1) CN113435318B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115393803A (en) * 2022-08-30 2022-11-25 京东方科技集团股份有限公司 Vehicle violation detection method, device and system, and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427797A (en) * 2019-05-28 2019-11-08 东南大学 A kind of three-dimensional vehicle detection method based on geometrical condition limitation
CN112926395A (en) * 2021-01-27 2021-06-08 上海商汤临港智能科技有限公司 Target detection method and device, computer equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3607489B1 (en) * 2017-04-04 2023-05-24 Robert Bosch GmbH Direct vehicle detection as 3d bounding boxes using neural network image processing
CN109214980B (en) * 2017-07-04 2023-06-23 阿波罗智能技术(北京)有限公司 Three-dimensional attitude estimation method, three-dimensional attitude estimation device, three-dimensional attitude estimation equipment and computer storage medium
JP2019096072A (en) * 2017-11-22 2019-06-20 株式会社東芝 Object detection device, object detection method and program
CN108875902A (en) * 2017-12-04 2018-11-23 北京旷视科技有限公司 Neural network training method and device, vehicle detection estimation method and device, storage medium
CN110390258A (en) * 2019-06-05 2019-10-29 东南大学 Annotating Method of 3D Information of Image Object
CN110517349A (en) * 2019-07-26 2019-11-29 电子科技大学 A 3D Vehicle Target Detection Method Based on Monocular Vision and Geometric Constraints
CN112766206B (en) * 2021-01-28 2024-05-28 深圳市捷顺科技实业股份有限公司 High-order video vehicle detection method and device, electronic equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427797A (en) * 2019-05-28 2019-11-08 东南大学 A kind of three-dimensional vehicle detection method based on geometrical condition limitation
CN112926395A (en) * 2021-01-27 2021-06-08 上海商汤临港智能科技有限公司 Target detection method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113435318A (en) 2021-09-24

Similar Documents

Publication Publication Date Title
WO2021196941A1 (en) Method and apparatus for detecting three-dimensional target
US10891795B2 (en) Localization method and apparatus based on 3D color map
CN113673438B (en) A collision warning method, device, electronic equipment and storage medium
CN108345838A (en) Automatic traffic lamp detection model is trained using analog image
CN111539484B (en) Method and device for training neural network
JP2014027481A (en) Drive video recording device and drive video recording system
CN112926461B (en) Neural network training, driving control method and device
JP2011215052A (en) Own-vehicle position detection system using scenic image recognition
CN113011364B (en) Neural network training, target object detection and driving control method and device
CN111928842B (en) Monocular vision based SLAM positioning method and related device
US12154349B2 (en) Method for detecting three-dimensional objects in roadway and electronic device
WO2022082571A1 (en) Lane line detection method and apparatus
WO2023138537A1 (en) Image processing method and apparatus, terminal device and storage medium
CN113139567B (en) Information processing apparatus, control method thereof, vehicle, recording medium, information processing server, information processing method, and program
CN117271687A (en) Track playback method, track playback device, electronic equipment and storage medium
CN113435318B (en) Neural network training, image detection, driving control method and device
US12157499B2 (en) Assistance method of safe driving and electronic device
US20210064872A1 (en) Object detecting system for detecting object by using hierarchical pyramid and object detecting method thereof
US12260655B2 (en) Method for detection of three-dimensional objects and electronic device
CN112654997B (en) Lane line detection method and device
CN112184605A (en) Method, equipment and system for enhancing vehicle driving visual field
US20230419682A1 (en) Method for managing driving and electronic device
CN116977810B (en) Multi-modal post-fusion long-tail category detection method and system
CN115131976B (en) Image processing apparatus and method
CN117842055A (en) Wrong driving behavior reminding method and device, vehicle-mounted terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant