WO2024146365A1 - Object detection method and apparatus and storage medium - Google Patents
Object detection method and apparatus and storage medium Download PDFInfo
- Publication number
- WO2024146365A1 (PCT/CN2023/139598)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target
- space
- ray
- coordinate system
- detection
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0014—Image feed-back for automatic industrial control, e.g. robot with camera
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Definitions
- the embodiments of the present disclosure relate to the field of visual detection technology, and in particular to a target detection method, a target detection device, and a storage medium.
- 3D object detection algorithms based on pure vision are an important research direction in the field of object detection.
- pure vision detection schemes generally have large errors in depth estimation, which leads to inaccurate 3D position estimation.
- Recently, academia and industry have tried to use multi-camera setups to improve the accuracy of depth estimation, but the improvement in depth measurement from multiple cameras is limited, so 3D position estimation remains inaccurate.
- an embodiment of the present disclosure provides a target detection method, the method comprising:
- a first position of the target in the 3D space is determined according to the target ray corresponding to the target.
- an embodiment of the present disclosure provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, it causes the processor to implement the target detection method described above.
- FIG1 is a flow chart of an embodiment of the target detection method disclosed herein;
- FIG2 is a flow chart of another embodiment of the target detection method disclosed herein;
- FIG9 is a schematic diagram showing the principle of solving the position of a target using multiple target rays in the target detection method disclosed herein.
- suffixes such as "module", "component" or "unit" used to represent elements are only intended to facilitate the description of the present disclosure and have no special meaning of their own; therefore, "module", "component" and "unit" can be used interchangeably.
- the embodiments of the present disclosure provide a target detection method, a target detection device, and a storage medium, which obtain a detection image of a camera device, wherein the detection image includes an image corresponding to a target; determine the position of an imaging point of the target on an image plane of the camera device according to the detection image; determine a target ray corresponding to the target according to the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device, wherein the target ray passes through the position of the imaging point of the target and the position of the optical center of the camera device; and determine a first position of the target in 3D space according to the target ray corresponding to the target.
- the embodiments of the present disclosure avoid the biased technical approach of estimating depth. Instead, the position of the imaging point of the target on the image plane of the camera device is determined from the detection image, the target ray passing through the position of the imaging point of the target and the position of the optical center of the camera device is determined, and the first position of the target in 3D space is determined from that target ray.
- the camera device of the embodiments of the present disclosure may be a camera device with a fixed viewing angle, such as a roadside camera or a camera in a factory building.
- the determination of the first position of the target in the 3D space based on the multiple target rays corresponding to the target may also be carried out in other ways.
- the point in space with the shortest total distance to all the target rays may be solved directly, and this point is the first position of the target in 3D space; or, in some embodiments, multiple candidate positions are obtained from each target ray and the plane where the target is located, and the candidate with the shortest distance to all the target rays is selected as the first position of the target in 3D space; and so on.
- the first position of the target in the 3D space can be determined based on a target ray corresponding to the target and the plane where the target is located.
- Sub-step S104B2 taking the position of the intersection of the target ray and the plane where the target is located as the first position of the target in the 3D space.
- the target ray and the plane where the target is located (such as the ground) will theoretically intersect, so the intersection point of the target ray and the plane where the target is located is solved, and the position of the intersection point is the first position of the target in the 3D space.
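The ray–plane intersection described here can be sketched in a few lines. This is an illustrative sketch, not the patent's implementation; the function name and the plane representation n·p + d = 0 are assumptions:

```python
import numpy as np

def ray_plane_intersection(origin, direction, normal, d):
    """Intersect the target ray (optical center `origin`, direction
    through the imaging point) with the plane n . p + d = 0."""
    denom = normal @ direction
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    t = -(d + normal @ origin) / denom
    if t < 0:
        return None  # the plane lies behind the optical center
    return origin + t * direction

# Camera 3 m above the ground plane z = 0, ray sloping down at 45 degrees:
p = ray_plane_intersection(np.array([0.0, 0.0, 3.0]),
                           np.array([1.0, 0.0, -1.0]),
                           np.array([0.0, 0.0, 1.0]), 0.0)
# p == [3., 0., 0.] -- the target's first position on the ground
```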
- besides the plane where the target is located, other geometric constraints can also be combined with a single target ray to determine the first position of the target in 3D space. For example, in some embodiments, if the target is on a specific straight line in the plane (such as a pedestrian on a zebra crossing on the ground), the first position of the target in 3D space can be determined from the target ray corresponding to the target and that straight line; and so on.
- the first position of the target in the 3D space exists objectively, but the representation method of the first position of the target in the 3D space is related to the selected coordinate system. In different coordinate systems, the representation method of the first position of the target in the 3D space is different.
- step S104 determining the first position of the target in the 3D space according to the target ray corresponding to the target, may include: determining the first position of the target in the world coordinate system according to the target ray corresponding to the target.
- the camera device can be placed at any position in the environment.
- a reference coordinate system is selected in the environment to describe the position of the camera device and any target (object) in the environment.
- This coordinate system is called the world coordinate system.
- Coordinate systems related to the world coordinate system also include the camera coordinate system, the pixel coordinate system, and the retinal coordinate system.
- the detection image (i.e., digital image) captured by the camera device can be stored as an array in the computer.
- the value of each element (pixel) in the array is the brightness (grayscale) of the image point; a rectangular coordinate system u-v is defined on the image, and the coordinates (u, v) of each pixel are the column number and row number of the pixel in the array respectively; therefore, (u, v) is the coordinate of the image coordinate system (also called pixel coordinate system) with pixels as the unit.
- the image coordinate system only indicates the number of columns and rows of the pixel in the digital image, and does not use physical units to indicate the physical position of the pixel in the image, it is necessary to establish an image plane coordinate system xy expressed in physical units (e.g. centimeters); (x, y) is used to represent the coordinates of the image plane coordinate system measured in physical units.
- the origin is defined at the intersection of the optical axis of the camera device and the image plane, which is called the principal point of the image. This point is generally located at the center of the image, and its coordinates in the image coordinate system are (u0, v0).
- the physical dimensions of each pixel in the x-axis and y-axis directions are dx and dy, and the relationship between the two coordinate systems is: u = x/dx + s'·y + u0, v = y/dy + v0.
- s' represents a skew factor caused by the non-orthogonality of the image plane coordinate axes of the camera device.
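Written as code, the pixel/image-plane relationship with the skew factor s' looks like the following sketch; the function name and argument layout are illustrative, and the skew convention u = x/dx + s'·y + u0 is an assumption consistent with the surrounding text:

```python
def image_plane_to_pixel(x, y, dx, dy, u0, v0, s=0.0):
    """Map physical image-plane coordinates (x, y) to pixel
    coordinates (u, v); dx, dy are the pixel sizes, (u0, v0) the
    principal point, and s the skew factor s'."""
    u = x / dx + s * y + u0
    v = y / dy + v0
    return u, v

# 10 um pixels, principal point at (320, 240), no skew:
u, v = image_plane_to_pixel(0.002, -0.001, dx=1e-5, dy=1e-5, u0=320.0, v0=240.0)
# u is approximately 520, v approximately 140
```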
- the origin of the camera coordinate system is the optical center of the camera device. Its x-axis and y-axis are parallel to the X and Y axes of the image.
- the z-axis is the optical axis of the camera device, which is perpendicular to the image plane.
- the spatial rectangular coordinate system formed by this is called the camera coordinate system.
- the camera coordinate system is a three-dimensional coordinate system. The intersection of the optical axis and the image plane is the origin of the image coordinate system.
- the rectangular coordinate system formed by the origin of the image coordinate system and the X and Y axes of the image is the image coordinate system.
- the image coordinate system is a two-dimensional coordinate system. The relationship between the camera coordinate system and the world coordinate system can be described by the rotation matrix R and the translation vector t.
- step S103 determining the target ray corresponding to the target based on the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device, may also include: sub-step S1031, sub-step S1032 and sub-step S1033, as shown in Figure 4.
- Sub-step S1031 determining the first coordinate of the imaging point of the target in the camera coordinate system according to the position of the imaging point of the target on the image plane of the imaging device and the internal parameter formula of the imaging device.
- the position of the imaging point of the target on the image plane of the camera device can be understood as the coordinates of the imaging point of the target in the image plane coordinate system.
- the internal parameter matrix of the camera device is required. If the coordinates of the target O_i detected by camera device A_k in the image plane coordinate system are (x_i, y_i, 1)_k, then according to the internal parameter matrix K of the camera device, the following relationship can be obtained: Z_i · (x_i, y_i, 1)_k^T = K · (X_i, Y_i, Z_i)_k^T.
- since the depth Z_i is not required in the subsequent target-ray calculation, Z_i can be eliminated from both sides of the equation, or an arbitrary constant can be substituted for Z_i. If the elimination method is used, the following relationship is obtained: (X_i', Y_i', 1)_k^T = K^(-1) · (x_i, y_i, 1)_k^T.
- the second coordinate of the optical center of the camera device in the camera coordinate system is the origin of the camera coordinate system (0,0,0) k .
- the external parameter matrix of the camera device can then be used to transform the first coordinate of the imaging point of the target in the camera coordinate system, (X_i', Y_i', 1)_k, and the origin of the camera coordinate system, (0, 0, 0)_k, into the world coordinate system.
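Combining the internal parameter matrix K with the external parameters (R, t) gives a world-space target ray. The following is an assumed sketch; the convention X_cam = R · X_world + t is one common choice and may differ from the patent's:

```python
import numpy as np

def target_ray_in_world(point_img, K, R, t):
    """Return (origin, direction): the optical center in world
    coordinates and the unit direction of the target ray through
    the imaging point `point_img` = (x_i, y_i)."""
    hom = np.array([point_img[0], point_img[1], 1.0])
    d_cam = np.linalg.inv(K) @ hom   # (X_i', Y_i', 1) in the camera frame
    origin = -R.T @ t                # optical center in the world frame
    d_world = R.T @ d_cam            # rotate the direction into the world frame
    return origin, d_world / np.linalg.norm(d_world)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
origin, d = target_ray_in_world((320.0, 240.0), K, np.eye(3), np.zeros(3))
# the principal point maps onto the optical axis: d == [0, 0, 1]
```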
- the coordinates of the final first position of the target in the world coordinate system are:
- the disclosed embodiments can also determine a second position of the target in 3D space (i.e., a position not derived from the target ray) from the detection image using current related technologies, including pure-vision 3D target detection algorithms.
- Step S108 Determine a first weight corresponding to the first position and a second weight corresponding to the second position according to the first confidence or first variance and the second confidence or second variance, wherein the sum of the first weight and the second weight is equal to 1.
- step S106 determining the final position of the target in the 3D space according to the first position and the second position, may also include: determining the final position of the target in the 3D space according to the first position, the second position, the first weight and the second weight.
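The weighted combination of steps S106/S108 reduces to a convex sum of the two positions. A minimal sketch follows; the weight here is supplied directly, whereas in the patent the weights come from confidences/variances (e.g. via a trained network), and the function name is illustrative:

```python
import numpy as np

def fuse_positions(first_pos, second_pos, w1):
    """Weighted sum of the ray-based first position and the coarse
    second position; the second weight is 1 - w1 so the two sum to 1."""
    w1 = float(np.clip(w1, 0.0, 1.0))
    return w1 * np.asarray(first_pos, float) + (1.0 - w1) * np.asarray(second_pos, float)

final = fuse_positions([10.0, 2.0, 0.0], [12.0, 2.0, 0.0], w1=0.75)
# final == [10.5, 2.0, 0.0]
```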
- Confidence is also called reliability, confidence level, or confidence coefficient. It is a probability value: because of the randomness of sampling, a conclusion drawn from a sample about a population parameter is always uncertain, so a probability statement, namely the interval estimation method of mathematical statistics, is used. The probability that the estimated value and the population parameter lie within a given allowable error range (that is, the probability that the population parameter falls within a certain range of the sample statistic) is called the confidence. Confidence is an important indicator for describing uncertainty and indicates the degree of certainty of an interval estimate.
- the first confidence level can be the reliability corresponding to the first position
- the second confidence level can be the reliability corresponding to the second position.
- variance is used to measure the degree of deviation between a random variable and its mathematical expectation (i.e., mean).
- the first variance can be the degree of deviation corresponding to the first position
- the second variance can be the degree of deviation corresponding to the second position. If there are multiple cameras, confidence can be used, and if there is only one camera, variance can be used.
- Module B is the target matching module, which matches the target data output by modules A1-An, that is, finds the data corresponding to the same target across the visual detection modules.
- the first step: modules A1-An, the visual detection modules, take the detection image as input and output the 2D detection result of the target in the detection image with its corresponding confidence, as well as the rough 3D detection result (i.e., the second position) with its corresponding confidence.
- the third step: module C, the target-ray position estimation module, uses the positions of the imaging points of the target from the viewpoints of multiple camera devices and the positions of their optical centers to construct multiple target rays, and calculates the first position of the target in 3D space.
- the principle is shown in Figure 9. Specifically, a target ray is constructed from the position of the imaging point of the target and the position of the optical center of the camera device, and the equation of the target ray in the world coordinate system is obtained; after obtaining at least two target rays from the results of different camera devices, the intersection of these target rays, or the point with the minimum sum of distances to these target rays, is calculated and taken as the first position of the target in 3D space.
- the coordinates P_i are solved numerically using analytic geometry.
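One standard way to obtain the minimum-distance point is a small linear least-squares problem. This sketch is one possible realization of the numerical analytic-geometry solution mentioned above, not necessarily the patent's method:

```python
import numpy as np

def nearest_point_to_rays(origins, directions):
    """Solve for the point minimizing the summed squared distance to
    the target rays (origin o_i = optical center, unit direction d_i).
    Fails (singular system) only if all rays are parallel."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = np.asarray(d, float)
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray
        A += P
        b += P @ np.asarray(o, float)
    return np.linalg.solve(A, b)

# Two rays that actually intersect at (1, 1, 0):
p = nearest_point_to_rays(
    [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    [[1.0, 1.0, 0.0], [-1.0, 1.0, 0.0]])
# p == [1., 1., 0.]
```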
- module D inputs the confidence of the 2D detection result obtained by module A1-An and the confidence of the 3D rough detection result into the trained neural network to obtain the first weight and the second weight.
- the first position of the target in the 3D space obtained by module C and the 3D rough detection result (i.e., the second position) obtained by module A1-An are weighted and summed to obtain the final result, i.e., the final position of the target in the 3D space.
- Embodiment 2: a combined solution of a 3D target detection system based on coarse depth estimation and multi-camera target rays, the structure of which is shown in FIG. 10.
- the first step: modules A1-An, the visual detection modules, take the detection image as input and output the 2D detection result of the target in the detection image with its corresponding confidence, as well as the rough depth estimate.
- module D inputs the confidence of the 2D detection results obtained by modules A1-An and the confidence of the 3D rough detection results obtained by module B using rough depth estimation into the trained neural network to obtain the first weight and the second weight.
- the first position of the target in the 3D space obtained by module C and the 3D rough detection result obtained by module B using rough depth estimation are weighted and summed to obtain the final result, that is, the final position of the target in the 3D space.
- Embodiment 3: a combined solution of a 3D target detection system based on ground equations and a monocular-camera target ray, the structure of which is shown in Figure 11.
- the first step is to use module A, the visual detection module, which takes the detection image as input and outputs the 2D detection result and the corresponding variance of the target in the detection image, as well as the 3D rough detection result and the corresponding variance.
- the second step, module C, the ray position estimation module uses the position of the imaging point of the target under the viewing angle of a camera device and the position of the optical center of the camera device to construct a target ray, and combines it with the ground equation to calculate the first position of the target in the 3D space.
- the principle is shown in Figure 12: Specifically, the target ray is constructed from the position of the imaging point of the target and the position of the optical center of the camera device, and the equation of the target ray in the world coordinate system is obtained; then the equation of the target ray in the world coordinate system is combined with the ground equation to solve the first position of the target in the 3D space.
Abstract
Provided in the embodiments of the present disclosure are an object detection method and apparatus and a storage medium. The method comprises: acquiring a detection image from a camera apparatus, the detection image comprising an image corresponding to an object; according to the detection image, determining the position of the imaging point of the object on the image plane of the camera apparatus; according to the position of the imaging point of the object on the image plane of the camera apparatus and the position of the optical center of the camera apparatus, determining an object ray corresponding to the object, the object ray passing through the position of the imaging point of the object and the position of the optical center of the camera apparatus; and, according to the object ray corresponding to the object, determining a first position of the object in a 3D space.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
The present disclosure is based on Chinese patent application 202310002393.9, filed on January 3, 2023 and entitled "Target Detection Method, Device and Storage Medium", claims priority to that application, and incorporates its entire disclosure herein by reference.
The embodiments of the present disclosure relate to the field of visual detection technology, and in particular to a target detection method, a target detection device, and a storage medium.
3D target detection algorithms based on pure vision are an important research direction in the field of target detection. However, pure-vision detection schemes generally have large errors in depth estimation, which leads to inaccurate 3D position estimation. To remedy this shortcoming, academia and industry have tried to use multi-camera setups to improve the accuracy of depth estimation, but the improvement in depth measurement from multiple cameras is limited, so 3D position estimation remains inaccurate.
SUMMARY OF THE INVENTION
Based on this, the embodiments of the present disclosure provide a target detection method, a target detection device and a storage medium, which can improve the accuracy of estimating the 3D position of a target.
In a first aspect, an embodiment of the present disclosure provides a target detection method, the method comprising:
acquiring a detection image of a camera device, wherein the detection image includes an image corresponding to a target;
determining the position of an imaging point of the target on an image plane of the camera device according to the detection image;
determining a target ray corresponding to the target according to the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device, wherein the target ray passes through the position of the imaging point of the target and the position of the optical center of the camera device; and
determining a first position of the target in 3D space according to the target ray corresponding to the target.
In a second aspect, an embodiment of the present disclosure provides a target detection device, which includes a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to execute the computer program and, when executing the computer program, implement the target detection method described above.
In a third aspect, an embodiment of the present disclosure provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, it causes the processor to implement the target detection method described above.
The embodiments of the present disclosure provide a target detection method, a target detection device, and a storage medium, which acquire a detection image of a camera device, wherein the detection image includes an image corresponding to a target; determine the position of an imaging point of the target on an image plane of the camera device according to the detection image; determine a target ray corresponding to the target according to the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device, wherein the target ray passes through the position of the imaging point of the target and the position of the optical center of the camera device; and determine a first position of the target in 3D space according to the target ray corresponding to the target.
FIG. 1 is a schematic flow chart of an embodiment of the target detection method of the present disclosure;
FIG. 2 is a schematic flow chart of another embodiment of the target detection method of the present disclosure;
FIG. 3 is a schematic flow chart of another embodiment of the target detection method of the present disclosure;
FIG. 4 is a schematic flow chart of another embodiment of the target detection method of the present disclosure;
FIG. 5 is a schematic flow chart of another embodiment of the target detection method of the present disclosure;
FIG. 6 is a schematic flow chart of another embodiment of the target detection method of the present disclosure;
FIG. 7 is a schematic structural diagram of an embodiment of an application of the target detection method of the present disclosure;
FIG. 8 is a schematic structural diagram of an embodiment of an application in which a rough 3D detection result is combined with the system of an embodiment of the present disclosure;
FIG. 9 is a schematic diagram showing the principle of solving the position of a target using multiple target rays in the target detection method of the present disclosure;
FIG. 10 is a schematic structural diagram of an embodiment of an application in which coarse depth estimation is combined with the system of an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of an embodiment of an application in which a ground equation is combined with the system of an embodiment of the present disclosure;
FIG. 12 is a schematic diagram showing the principle of solving the position of a target by combining a target ray with a ground equation in the target detection method of the present disclosure;
FIG. 13 is a schematic structural diagram of an embodiment of the target detection device of the present disclosure.
The technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present disclosure.
The flowcharts shown in the accompanying drawings are only illustrative; they need not include all contents and operations/steps, nor be executed in the order described. For example, some operations/steps may be decomposed, combined or partially merged, so the actual execution order may change according to actual conditions.
In the following description, suffixes such as "module", "component" or "unit" used to represent elements are only intended to facilitate the description of the present disclosure and have no special meaning of their own; therefore, "module", "component" and "unit" can be used interchangeably.
在详细介绍本公开实施例之前,先介绍一下相关技术。Before introducing the embodiments of the present disclosure in detail, the related technology is first introduced.
基于纯视觉的3D目标检测算法是目标检测领域的重要研究方向。在工程上，基于纯视觉的目标检测方案因其成本低、对色彩纹理敏感等特点而受到重视。但是由于相机等视觉探测器对于距离没有绝对的测量能力，所以基于纯视觉的目标检测方案对深度的估计普遍存在较大误差，进而导致估计得到的目标的3D位置在精度上难以与激光雷达的目标检测方案竞争。目前基于纯视觉的3D目标检测方案的主流方法是利用目标的2D检测结果，外加预测的目标深度值，通过相机参数得到目标在3D空间中的位置信息，或是直接预测目标的3D位置。不论哪种方法，由于相机对距离的探测存在先天缺陷，因此通常深度估计误差较大，进而导致3D位置估计不准确。为了改善这一缺点，学术界和业界尝试采用多目相机来提高深度估计的精度。而目前采取的方法多是将多目相机作为深度估计的工具，尽可能地还原图像每个像素点的深度，这导致相机视角差距不能太大，但正是由于相机视角的相关性较大，对深度测量的改进效果不明显。3D target detection algorithms based on pure vision are an important research direction in the field of target detection. In engineering, target detection schemes based on pure vision are valued for their low cost and sensitivity to color and texture. However, since visual detectors such as cameras have no absolute distance measurement capability, pure-vision target detection schemes generally have large errors in depth estimation, which makes it difficult for the estimated 3D position of the target to compete in accuracy with lidar-based target detection schemes. The current mainstream approach of pure-vision 3D target detection is to use the 2D detection result of the target, plus a predicted target depth value, to obtain the target's position in 3D space through the camera parameters, or to directly predict the target's 3D position. Regardless of the method, due to the inherent defects of cameras in sensing distance, the depth estimation error is usually large, which leads to inaccurate 3D position estimation. To mitigate this shortcoming, academia and industry have tried multi-camera setups to improve the accuracy of depth estimation. The current methods mostly use the multiple cameras as a depth-estimation tool, recovering the depth of each pixel in the image as far as possible; this requires that the difference between camera viewpoints not be too large, but precisely because the viewpoints are then highly correlated, the improvement in depth measurement is limited.
本公开实施例提供了一种目标检测方法、目标检测装置及存储介质,获取摄像装置的检测图像,所述检测图像包括目标对应的图像;根据所述检测图像确定所述目标的成像点在所述摄像装置的像平面上的位置;根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的光心的位置,确定所述目标对应的目标射线,所述目标射线通过所述目标的成像点的位置和所述摄像装置的光心的位置;根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置。相较于相关技术中基于纯视觉的检测方案对深度估计进而估计目标的位置,本公开实施例避开对深度进行估计这一带有技术偏见的技术思路,而是根据检测图像确定目标的成像点在摄像装置的像平面上的位置,确定通过目标的成像点的位置和摄像装置的光心的位置的目标射线,根据目标射线确定目标在3D空间的第一位置,由于对目标的成像点在摄像装置的像平面上的位置的估计比对深度的估计更加准确,根据成像原理可知理论上目标必定在成像点、光心所确定的目标射线上,因此根据目标射线能够更加准确地确定目标在3D空间的第一位置。The embodiments of the present disclosure provide a target detection method, a target detection device, and a storage medium, which obtain a detection image of a camera device, wherein the detection image includes an image corresponding to a target; determine the position of an imaging point of the target on an image plane of the camera device according to the detection image; determine a target ray corresponding to the target according to the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device, wherein the target ray passes through the position of the imaging point of the target and the position of the optical center of the camera device; and determine a first position of the target in 3D space according to the target ray corresponding to the target. Compared with the related art that estimates the depth and then estimates the position of the target based on a pure visual detection scheme, the embodiments of the present disclosure avoid the technical idea of estimating the depth, which has a technical bias. Instead, the position of the imaging point of the target on the image plane of the camera device is determined based on the detected image, and the target ray passing through the position of the imaging point of the target and the position of the optical center of the camera device is determined. The first position of the target in the 3D space is determined based on the target ray. 
Since the estimation of the position of the imaging point of the target on the image plane of the camera device is more accurate than the estimation of the depth, according to the imaging principle, it can be known that theoretically the target must be on the target ray determined by the imaging point and the optical center. Therefore, the first position of the target in the 3D space can be determined more accurately based on the target ray.
下面结合附图对本公开实施例进行详细说明。The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
参见图1,图1是本公开目标检测方法一实施例的流程示意图,所述方法包括:步骤S101、步骤S102、步骤S103以及步骤S104。Referring to FIG. 1 , FIG. 1 is a flow chart of an embodiment of a target detection method disclosed herein, wherein the method comprises: step S101 , step S102 , step S103 , and step S104 .
步骤S101:获取摄像装置的检测图像,所述检测图像包括目标对应的图像。Step S101: Acquire a detection image of a camera device, wherein the detection image includes an image corresponding to a target.
本公开实施例中，检测图像是摄像装置对目标进行拍摄而得到的，因此检测图像包括目标的图像。摄像装置的数量可以是一个，也可以是多个；摄像装置是一个时，检测图像为一个摄像装置的检测图像，摄像装置为多个时，检测图像为多个摄像装置的检测图像。In the disclosed embodiments, the detection image is obtained by photographing the target with a camera device, so the detection image includes an image of the target. The number of camera devices may be one or more; when there is one camera device, the detection image is the detection image of that camera device, and when there are multiple camera devices, the detection images are the detection images of the multiple camera devices.
本公开实施例的摄像装置可以是路侧、厂房等固定视角下的摄像装置。The camera device of the embodiments of the present disclosure may be a camera device with a fixed viewing angle, such as one mounted at a roadside or in a factory building.
步骤S102:根据所述检测图像确定所述目标的成像点在所述摄像装置的像平面上的位置。Step S102: determining the position of the imaging point of the target on the image plane of the camera device according to the detection image.
对于一个目标来说,摄像装置拍摄目标得到检测图像,目标的成像点的位置在理论上已经确定。检测图像是在像平面上产生的,因此根据所述检测图像能够很容易、且很准确地确定所述目标的成像点在所述摄像装置的像平面上的位置。For a target, the camera device captures the target to obtain a detection image, and the position of the imaging point of the target is theoretically determined. The detection image is generated on the image plane, so the position of the imaging point of the target on the image plane of the camera device can be easily and accurately determined based on the detection image.
步骤S103:根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的光心的位置,确定所述目标对应的目标射线,所述目标射线通过所述目标的成像点的位置和所述摄像装置的光心的位置。Step S103: Determine a target ray corresponding to the target according to the position of the imaging point of the target on the image plane of the imaging device and the position of the optical center of the imaging device, wherein the target ray passes through the position of the imaging point of the target and the position of the optical center of the imaging device.
根据摄像装置的小孔成像原理可知,目标的成像点在所述摄像装置的像平面上的位置、摄像装置的光心的位置以及目标在3D空间的位置在理论上呈一条直线,如果能够知道这条直线,那么可以确定目标在3D空间的位置。两点确定一条直线,因此根据目标的成像点在所述摄像装置的像平面上的位置、摄像装置的光心的位置可以确定目标对应的目标射线。According to the pinhole imaging principle of the camera device, the position of the imaging point of the target on the image plane of the camera device, the position of the optical center of the camera device, and the position of the target in the 3D space are theoretically a straight line. If this straight line is known, the position of the target in the 3D space can be determined. Two points determine a straight line, so the target ray corresponding to the target can be determined according to the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device.
步骤S104:根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置。Step S104: determining a first position of the target in the 3D space according to the target ray corresponding to the target.
理论上目标在3D空间的位置一定在目标射线上，因此根据所述目标对应的目标射线，可以较为准确地确定所述目标在3D空间的第一位置。Theoretically, the position of the target in 3D space must lie on the target ray, so the first position of the target in the 3D space can be determined relatively accurately according to the target ray corresponding to the target.
相较于相关技术中基于纯视觉的检测方案对深度估计进而估计目标的位置,本公开实施例避开对深度进行估计这一带有技术偏见的技术思路,而是根据检测图像确定目标的成像点在摄像装置的像平面上的位置,确定通过目标的成像点的位置和摄像装置的光心的位置的目标射线,根据目标射线确定目标在3D空间的第一位置,由于对目标的成像点在摄像装置的像平面上的位置的估计比对深度的估计更加准确,根据成像原理可知理论上目标必定在成像点、光心所确定的目标射线上,因此根据目标射线能够更加准确地确定目标在3D空间的第一位置,能够提高目标的3D检测精度。Compared with the related art that estimates the depth and then estimates the position of the target based on a pure visual detection scheme, the embodiment of the present disclosure avoids the technical idea of estimating the depth which has a technical bias, and instead determines the position of the imaging point of the target on the image plane of the camera device based on the detected image, determines the target ray passing through the position of the imaging point of the target and the position of the optical center of the camera device, and determines the first position of the target in the 3D space based on the target ray. Since the estimation of the position of the imaging point of the target on the image plane of the camera device is more accurate than the estimation of the depth, according to the imaging principle, it can be known that theoretically the target must be on the target ray determined by the imaging point and the optical center. Therefore, the first position of the target in the 3D space can be more accurately determined based on the target ray, which can improve the 3D detection accuracy of the target.
在一些实施例中,所述摄像装置的数量为多个;步骤S104,所述根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置,还可以包括:根据所述目标对应的多个目标射线,确定所述目标在3D空间的第一位置。In some embodiments, there are multiple cameras; step S104, determining the first position of the target in the 3D space according to the target ray corresponding to the target, may also include: determining the first position of the target in the 3D space according to multiple target rays corresponding to the target.
本公开实施例中,摄像装置的数量是多个时,对应每一个摄像装置,均存在一个目标的成像点的位置和一个摄像装置的光心的位置,因此每个摄像装置可以对应一个目标射线。由于这多个摄像装置均是对目标进行拍摄,很显然理论上目标位于多个目标射线的交点上。因此,根据所述目标对应的多个目标射线可以确定所述目标在3D空间的第一位置。In the embodiment of the present disclosure, when there are multiple cameras, there is a position of an imaging point of a target and a position of an optical center of the camera corresponding to each camera, so each camera can correspond to a target ray. Since these multiple cameras are all shooting the target, it is obvious that the target is theoretically located at the intersection of multiple target rays. Therefore, the first position of the target in 3D space can be determined based on the multiple target rays corresponding to the target.
在一些实施例中,步骤S104,所述根据所述目标对应的多个目标射线,确定所述目标在3D空间的第一位置,还可以包括:子步骤S104A1和子步骤S104A2,如图2所示。In some embodiments, step S104, determining the first position of the target in the 3D space according to the multiple target rays corresponding to the target, may also include: sub-step S104A1 and sub-step S104A2, as shown in FIG. 2 .
子步骤S104A1:求解所述3D空间中到各个所述目标射线的距离之和最小的最小距离点。Sub-step S104A1: finding the minimum distance point in the 3D space where the sum of distances to each of the target rays is the smallest.
子步骤S104A2:将所述最小距离点的位置作为所述目标在3D空间的第一位置。Sub-step S104A2: taking the position of the minimum distance point as the first position of the target in the 3D space.
虽然理论上可以通过求解多个目标射线的交点得到目标在3D空间的第一位置,但是由于根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的光心的位置实际确定的目标射线与理论上的目标射线可能存在偏差,即实际确定的目标射线与理论上的目标射线可能并不完全一致,因此多个目标射线在3D空间可能并不一定会相交。此时求解目标在3D空间的第一位置的方法可以采用最小距离和法,即不管多个目标射线在3D空间是否会相交,但是空间中的目标到每个目标射线的距离之和是最小的,因此求解所述3D空间中到各个所述目标射线的距离之和最小的最小距离点,该最小距离点的位置即为所述目标在3D空间的第一位置。该确定所述目标在3D空间的第一位置的方式简单方便,且求解结果更加准确。Although theoretically the first position of the target in 3D space can be obtained by solving the intersection of multiple target rays, the target rays actually determined based on the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device may deviate from the theoretical target rays, that is, the actually determined target rays may not be completely consistent with the theoretical target rays, so multiple target rays may not necessarily intersect in 3D space. At this time, the method for solving the first position of the target in 3D space can adopt the minimum distance sum method, that is, regardless of whether multiple target rays will intersect in 3D space, the sum of the distances from the target in space to each target ray is the smallest, so the minimum distance point in the 3D space where the sum of the distances to each target ray is the smallest is solved, and the position of the minimum distance point is the first position of the target in 3D space. This method of determining the first position of the target in 3D space is simple and convenient, and the solution result is more accurate.
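上述最小距离和思想可以用如下Python代码示意（非本公开实施例的一部分，仅作说明；为便于求得闭式解，此处最小化的是到各射线的距离平方和，这是对"距离之和最小"的一种常见近似）：The minimum-distance idea above can be sketched in Python as follows (not part of the disclosed embodiments, for illustration only; to admit a closed-form solution, this sketch minimizes the sum of squared distances to the rays, a common approximation of the "minimum sum of distances"):

```python
import numpy as np

def triangulate_rays(origins, directions):
    """Find the point minimizing the sum of squared distances to a set
    of rays, each given by an origin o_k and a direction d_k.

    For unit d_k, the squared distance from point p to ray k is
    ||(I - d_k d_k^T)(p - o_k)||^2, so setting the gradient to zero
    gives the normal equations  sum_k M_k p = sum_k M_k o_k,
    where M_k = I - d_k d_k^T."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        M = np.eye(3) - np.outer(d, d)  # projector onto the plane normal to d
        A += M
        b += M @ np.asarray(o, dtype=float)
    return np.linalg.solve(A, b)

# Two rays that intersect at (1, 1, 0): the solver recovers the intersection.
p = triangulate_rays(
    origins=[[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    directions=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
```

若多条射线恰好相交，该解即为交点（此时距离之和为0）；若射线因估计偏差而不相交，该解给出与各射线整体最接近的点。If the rays happen to intersect, the solution is the intersection (the distance sum is 0); if they do not intersect due to estimation deviations, it gives the point closest to all rays overall.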
需要说明的是,所述根据所述目标对应的多个目标射线确定所述目标在3D空间的第一位置还可以采用其他方式,例如,在一些实施例中,也可以直接求解空间中到每个目标射线的距离最小的点,该点即为目标在3D空间的第一位置;或者,在一些实施例中,根据每个目标射线与目标所在的平面求解得到多个位置,再从这些位置中选择到每个目标射线的距离最小的位置作为目标在3D空间的第一位置;等等。It should be noted that the determination of the first position of the target in the 3D space based on the multiple target rays corresponding to the target may also be carried out in other ways. For example, in some embodiments, the point in the space with the shortest distance to each target ray may be directly solved, and this point is the first position of the target in the 3D space; or, in some embodiments, multiple positions are obtained based on each target ray and the plane where the target is located, and then the position with the shortest distance to each target ray is selected from these positions as the first position of the target in the 3D space; and so on.
在一些实施例中,子步骤S104A2,所述将所述最小距离点的位置作为所述目标在3D空间的第一位置,还可以包括:将多个所述目标射线的交点的位置作为所述目标在3D空间的第一位置。如果实际确定的目标射线与理论上的目标射线没有偏差,即实际确定的目标射线与理论上的目标射线完全一致,则多个目标射线在空间中会相交,相交的交点到各个目标射线的距离为0,距离之和也为0,此时多个所述目标射线的交点的位置即为所述目标在3D空间的第一位置。In some embodiments, sub-step S104A2, taking the position of the minimum distance point as the first position of the target in the 3D space, may also include: taking the position of the intersection of the plurality of target rays as the first position of the target in the 3D space. If the target ray actually determined has no deviation from the theoretical target ray, that is, the target ray actually determined is completely consistent with the theoretical target ray, then the plurality of target rays will intersect in space, and the distance from the intersection point to each target ray is 0, and the sum of the distances is also 0. At this time, the position of the intersection of the plurality of target rays is the first position of the target in the 3D space.
在一些实施例中，所述摄像装置的数量为一个；步骤S104，所述根据所述目标对应的目标射线，确定所述目标在3D空间的第一位置，可以包括：根据所述目标对应的一个目标射线和所述目标所在的平面，确定所述目标在3D空间的第一位置。In some embodiments, the number of camera devices is one; step S104, determining the first position of the target in the 3D space according to the target ray corresponding to the target, may include: determining the first position of the target in the 3D space according to one target ray corresponding to the target and the plane where the target is located.
本公开实施例中,摄像装置的数量是一个时,对应的目标射线也是一个,目标在空间中所在的平面(例如,目标所在的平面为地面)已知,那么根据所述目标对应的一个目标射线和所述目标所在的平面即可确定所述目标在3D空间的第一位置。In the embodiment of the present disclosure, when the number of camera devices is one, the corresponding target ray is also one, and the plane where the target is located in the space (for example, the plane where the target is located is the ground) is known, then the first position of the target in the 3D space can be determined based on a target ray corresponding to the target and the plane where the target is located.
在一些实施例中,步骤S104,所述根据所述目标对应的一个目标射线和所述目标所在的平面,确定所述目标在3D空间的第一位置,还可以包括:子步骤S104B1和子步骤S104B2,如图3所示。In some embodiments, step S104, determining the first position of the target in the 3D space according to a target ray corresponding to the target and the plane where the target is located, may also include: sub-step S104B1 and sub-step S104B2, as shown in FIG. 3 .
子步骤S104B1:求解所述目标射线与所述目标所在的平面的交点。Sub-step S104B1: solving the intersection point between the target ray and the plane where the target is located.
子步骤S104B2:将所述目标射线与所述目标所在的平面的交点的位置作为所述目标在3D空间的第一位置。Sub-step S104B2: taking the position of the intersection of the target ray and the plane where the target is located as the first position of the target in the 3D space.
目标射线与目标所在的平面(例如地面)理论上一定会相交,因此求解所述目标射线与所述目标所在的平面的交点,该交点的位置即为所述目标在3D空间的第一位置。The target ray and the plane where the target is located (such as the ground) will theoretically intersect, so the intersection point of the target ray and the plane where the target is located is solved, and the position of the intersection point is the first position of the target in the 3D space.
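求解目标射线与目标所在平面（例如用地面方程 ax+by+cz+d=0 表示的地面）的交点可示意如下（假设性示例，非本公开实施例的一部分）：Solving the intersection of the target ray with the plane where the target is located (e.g., the ground expressed by a ground equation ax+by+cz+d=0) can be sketched as follows (a hypothetical example, not part of the disclosed embodiments):

```python
import numpy as np

def ray_plane_intersection(origin, direction, plane):
    """Intersect the ray origin + t*direction (t >= 0) with the plane
    a*x + b*y + c*z + d = 0, given as plane = (a, b, c, d).
    Returns the intersection point, or None when the ray is parallel
    to the plane or points away from it."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    normal = np.asarray(plane[:3], dtype=float)
    denom = normal @ direction
    if abs(denom) < 1e-12:       # ray (almost) parallel to the plane
        return None
    t = -(normal @ origin + plane[3]) / denom
    if t < 0:                    # plane is behind the ray origin
        return None
    return origin + t * direction

# Optical center 3 m above the ground plane z = 0, ray pointing forward
# and downward: the target is located where the ray meets the ground.
pt = ray_plane_intersection([0.0, 0.0, 3.0], [1.0, 0.0, -1.0],
                            (0.0, 0.0, 1.0, 0.0))
```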
需要说明的是,根据所述目标对应的一个目标射线和所述目标所在的平面确定所述目标在3D空间的第一位置还可以采用其他方式,例如:在一些实施例中,如果目标在所在的平面的某一个具体的直线上(例如行人在地面的斑马线上),那么根据所述目标对应的一个目标射线和该直线也可以确定目标在3D空间的第一位置;等等。It should be noted that other methods can also be used to determine the first position of the target in the 3D space based on a target ray corresponding to the target and the plane where the target is located. For example, in some embodiments, if the target is on a specific straight line in the plane (such as a pedestrian on a zebra crossing on the ground), then the first position of the target in the 3D space can also be determined based on a target ray corresponding to the target and the straight line; and so on.
目标在3D空间的第一位置是客观存在的,但是目标在3D空间的第一位置的表示方式与选定的坐标系有关,在不同的坐标系下,目标在3D空间的第一位置的表示方式不同。The first position of the target in the 3D space exists objectively, but the representation method of the first position of the target in the 3D space is related to the selected coordinate system. In different coordinate systems, the representation method of the first position of the target in the 3D space is different.
在一些实施例中,步骤S104,所述根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置,可以包括:根据所述目标对应的目标射线,确定所述目标在世界坐标系下的第一位置。In some embodiments, step S104, determining the first position of the target in the 3D space according to the target ray corresponding to the target, may include: determining the first position of the target in the world coordinate system according to the target ray corresponding to the target.
摄像装置可安放在环境中的任意位置,在环境中选择一个参考坐标系来描述摄像装置和环境中任何目标(物体)的位置,该坐标系称为世界坐标系(World coordinate system)。与世界坐标系相关的坐标系还包括相机坐标系(Camera coordinate system)、图像坐标系(Pixel coordinate system)以及像平面坐标系(Retinal coordinate system)。The camera device can be placed at any position in the environment. A reference coordinate system is selected in the environment to describe the position of the camera device and any target (object) in the environment. This coordinate system is called the world coordinate system. Coordinate systems related to the world coordinate system also include the camera coordinate system, the pixel coordinate system, and the retinal coordinate system.
摄像装置采集的检测图像(即数字图像)在计算机内可以存储为数组,数组中的每一个元素(象素,pixel)的值即是图像点的亮度(灰度);在图像上定义直角坐标系u-v,每一象素的坐标(u,v)分别是该象素在数组中的列数和行数;故(u,v)是以象素为单位的图像坐标系(也称为像素坐标系)的坐标。The detection image (i.e., digital image) captured by the camera device can be stored as an array in the computer. The value of each element (pixel) in the array is the brightness (grayscale) of the image point; a rectangular coordinate system u-v is defined on the image, and the coordinates (u, v) of each pixel are the column number and row number of the pixel in the array respectively; therefore, (u, v) is the coordinate of the image coordinate system (also called pixel coordinate system) with pixels as the unit.
由于图像坐标系只表示象素位于数字图像的列数和行数，并没有用物理单位表示出该象素在图像中的物理位置，因而需要再建立以物理单位（例如厘米）表示的像平面坐标系x-y；用(x,y)表示以物理单位度量的像平面坐标系的坐标。在x-y坐标系中，原点定义在摄像装置光轴和图像平面的交点处，称为图像的主点（principal point），该点一般位于图像中心处，在图像坐标系下的坐标为(u0,v0)，每个象素在x轴和y轴方向上的物理尺寸为dx、dy，两个坐标系的关系如下：Since the image coordinate system only indicates the column and row of a pixel in the digital image, and does not express the physical position of that pixel in the image in physical units, an image plane coordinate system x-y expressed in physical units (e.g. centimeters) must also be established; (x, y) denotes the coordinates of the image plane coordinate system measured in physical units. In the x-y coordinate system, the origin is defined at the intersection of the optical axis of the camera device and the image plane, called the principal point of the image; this point is generally located at the center of the image, and its coordinates in the image coordinate system are (u0, v0). The physical dimensions of each pixel along the x and y axes are dx and dy, and the relationship between the two coordinate systems is as follows:

$$\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=\begin{bmatrix}\dfrac{1}{dx}&s'&u_0\\ 0&\dfrac{1}{dy}&v_0\\ 0&0&1\end{bmatrix}\begin{bmatrix}x\\ y\\ 1\end{bmatrix}$$

其中s'表示因摄像装置像平面坐标轴相互不正交引出的倾斜因子（skew factor）。Wherein s' represents the skew factor caused by the non-orthogonality of the image plane coordinate axes of the camera device.
相机坐标系的原点为摄像装置的光心,其x轴与y轴与图像的X,Y轴平行,z轴为摄像装置的光轴,它与图像平面垂直,以此构成的空间直角坐标系称为相机坐标系,相机坐标系是三维坐标系。光轴与图像平面的交点,即为图像坐标系的原点,图像坐标系的原点与图像的X、Y轴构成的直角坐标系即为图像坐标系,图像坐标系是二维坐标系。相机坐标系与世界坐标系之间的关系可以用旋转矩阵R与平移向量t来描述。The origin of the camera coordinate system is the optical center of the camera device. Its x-axis and y-axis are parallel to the X and Y axes of the image. The z-axis is the optical axis of the camera device, which is perpendicular to the image plane. The spatial rectangular coordinate system formed by this is called the camera coordinate system. The camera coordinate system is a three-dimensional coordinate system. The intersection of the optical axis and the image plane is the origin of the image coordinate system. The rectangular coordinate system formed by the origin of the image coordinate system and the X and Y axes of the image is the image coordinate system. The image coordinate system is a two-dimensional coordinate system. The relationship between the camera coordinate system and the world coordinate system can be described by the rotation matrix R and the translation vector t.
在一些实施例中,步骤S103,所述根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的光心的位置,确定所述目标对应的目标射线,还可以包括:子步骤S1031、子步骤S1032以及子步骤S1033,如图4所示。In some embodiments, step S103, determining the target ray corresponding to the target based on the position of the imaging point of the target on the image plane of the camera device and the position of the optical center of the camera device, may also include: sub-step S1031, sub-step S1032 and sub-step S1033, as shown in Figure 4.
子步骤S1031:根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的内参公式,确定所述目标的成像点在相机坐标系下的第一坐标。Sub-step S1031: determining the first coordinate of the imaging point of the target in the camera coordinate system according to the position of the imaging point of the target on the image plane of the imaging device and the internal parameter formula of the imaging device.
目标的成像点在所述摄像装置的像平面上的位置可以理解为目标的成像点在像平面坐标系下的坐标，将目标的成像点在像平面坐标系下的坐标转换到目标的成像点在相机坐标系下的第一坐标，需要摄像装置的内参公式，如果摄像装置A_k中检测到的目标O_i的像平面坐标系下的坐标为(x_i,y_i,1)_k，根据摄像装置的内参公式K，即可得到如下关系式：The position of the imaging point of the target on the image plane of the camera device can be understood as the coordinates of the imaging point of the target in the image plane coordinate system. To convert the coordinates of the imaging point of the target in the image plane coordinate system to the first coordinate of the imaging point of the target in the camera coordinate system, the intrinsic parameter formula of the camera device is required. If the coordinates of the target O_i detected by camera device A_k in the image plane coordinate system are (x_i, y_i, 1)_k, then according to the intrinsic parameter formula K of the camera device, the following relationship is obtained:

$$Z_i\begin{bmatrix}x_i\\ y_i\\ 1\end{bmatrix}_k=K\begin{bmatrix}X_i\\ Y_i\\ Z_i\end{bmatrix}_k$$
由于在后续的目标射线计算中不需要深度，可将等式两边的Z_i约掉，也可将Z_i带入为任意常数，若采用约掉方法可得到如下关系式：Since the depth is not required in the subsequent target ray calculation, Z_i can be eliminated from both sides of the equation, or Z_i can be substituted by an arbitrary constant. If the elimination method is used, the following relationship is obtained:

$$\begin{bmatrix}X_i'\\ Y_i'\\ 1\end{bmatrix}_k=K^{-1}\begin{bmatrix}x_i\\ y_i\\ 1\end{bmatrix}_k$$
其中$(X_i',Y_i',1)_k$可以看做是目标的成像点在相机坐标系下的第一坐标。Here $(X_i',Y_i',1)_k$ can be regarded as the first coordinate of the imaging point of the target in the camera coordinate system.
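由像平面坐标经内参求成像点在相机坐标系下的第一坐标，可用如下Python代码示意（内参矩阵K的数值为假设值，仅作说明，非本公开实施例的一部分）：Obtaining the first coordinate of the imaging point in the camera coordinate system from the image plane coordinates via the intrinsics can be sketched in Python as follows (the values of the intrinsic matrix K are hypothetical, for illustration only, not part of the disclosed embodiments):

```python
import numpy as np

# Hypothetical intrinsic matrix K: fx = fy = 800 (focal length in pixels),
# principal point (u0, v0) = (320, 240), zero skew. Values are examples only.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def unproject(u, v, K):
    """Map a pixel (u, v) to the point (X', Y', 1) in the camera
    coordinate system via K^{-1}. The unknown depth Z_i cancels out,
    so the result only determines the direction of the target ray."""
    p = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return p / p[2]              # normalize the last component to 1

# The principal point unprojects onto the optical axis, i.e. (0, 0, 1).
ray_point = unproject(320.0, 240.0, K)
```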
子步骤S1032:根据所述目标的成像点在所述相机坐标系下的第一坐标、所述摄像装置的光心在所述相机坐标系下的第二坐标以及所述摄像装置的外参矩阵,确定所述目标的成像点在所述世界坐标系下的第三坐标以及所述摄像装置的光心在所述世界坐标系下的第四坐标。Sub-step S1032: Determine the third coordinate of the imaging point of the target in the world coordinate system and the fourth coordinate of the optical center of the camera device in the world coordinate system based on the first coordinate of the imaging point of the target in the camera coordinate system, the second coordinate of the optical center of the camera device in the camera coordinate system, and the extrinsic parameter matrix of the camera device.
以目标的成像点在相机坐标系下的第一坐标的表示方式$(X_i',Y_i',1)_k$为例，摄像装置的光心在所述相机坐标系下的第二坐标即为相机坐标系的原点$(0,0,0)_k$，利用摄像装置的外参矩阵$[R_k|t_k]$可以求得目标的成像点在相机坐标系下的第一坐标$(X_i',Y_i',1)_k$和相机坐标系的原点$(0,0,0)_k$分别在所述世界坐标系下的第三坐标$P_{i,k}^w$、第四坐标$C_k^w$：Taking the representation $(X_i',Y_i',1)_k$ of the first coordinate of the imaging point of the target in the camera coordinate system as an example, the second coordinate of the optical center of the camera device in the camera coordinate system is the origin $(0,0,0)_k$ of the camera coordinate system. Using the extrinsic matrix $[R_k|t_k]$ of the camera device, the third coordinate $P_{i,k}^w$ of the imaging point and the fourth coordinate $C_k^w$ of the origin of the camera coordinate system in the world coordinate system can be obtained:

$$P_{i,k}^w=R_k^{-1}\left(\begin{bmatrix}X_i'\\ Y_i'\\ 1\end{bmatrix}_k-t_k\right),\qquad C_k^w=R_k^{-1}\bigl(0-t_k\bigr)=-R_k^{-1}t_k$$
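由相机坐标系变换到世界坐标系可用如下Python代码示意（假设外参满足 p_cam = R·p_world + t 这一常见约定，旋转与平移的数值仅为示例，非本公开实施例的一部分）：The transformation from the camera coordinate system to the world coordinate system can be sketched in Python as follows (assuming the extrinsics follow the common convention p_cam = R·p_world + t; the rotation and translation values are examples only, not part of the disclosed embodiments):

```python
import numpy as np

def camera_to_world(p_cam, R, t):
    """Map a point from camera coordinates to world coordinates,
    assuming the extrinsics follow the common convention
    p_cam = R @ p_world + t, whose inverse is
    p_world = R^T @ (p_cam - t) because R is a rotation matrix."""
    return R.T @ (np.asarray(p_cam, dtype=float) - t)

# Toy extrinsics: a 90-degree rotation about the z axis plus a translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

center_w = camera_to_world([0.0, 0.0, 0.0], R, t)  # optical center in world frame
point_w = camera_to_world([0.5, 0.5, 1.0], R, t)   # imaging point (X', Y', 1)
```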
子步骤S1033:根据所述目标的成像点在所述世界坐标系下的第三坐标以及所述摄像装置的光心在所述世界坐标系下的第四坐标,确定所述目标在所述世界坐标系下对应的目标射线。Sub-step S1033: determining a target ray corresponding to the target in the world coordinate system according to the third coordinate of the imaging point of the target in the world coordinate system and the fourth coordinate of the optical center of the camera device in the world coordinate system.
以第三坐标$P_{i,k}^w$、第四坐标$C_k^w$的表示方式为例，利用第三坐标与第四坐标可得到摄像装置$A_k$对于目标$O_i$的目标射线$L_{i,k}$：Taking the representations $P_{i,k}^w$ for the third coordinate and $C_k^w$ for the fourth coordinate as an example, the target ray $L_{i,k}$ of camera device $A_k$ for target $O_i$ can be obtained from the third and fourth coordinates:

$$L_{i,k}(\lambda)=C_k^w+\lambda\left(P_{i,k}^w-C_k^w\right),\quad \lambda\ge 0$$
最终目标在世界坐标系下的第一位置的坐标为：The coordinates of the first position of the target in the world coordinate system are finally:

$$P_i^{*}=\arg\min_{P}\sum_{k}d\!\left(P,L_{i,k}\right)$$
其中$d(P,L_{i,k})$表示空间中的点$P$到目标射线$L_{i,k}$的距离，求解目标在世界坐标系下的第一位置的坐标可采用但不限于梯度下降法、牛顿法、共轭梯度法等最优化方法、近似数值解法等。Here $d(P,L_{i,k})$ denotes the distance from a point $P$ in space to the target ray $L_{i,k}$; the coordinates of the first position of the target in the world coordinate system can be solved using, but not limited to, optimization methods such as gradient descent, Newton's method and the conjugate gradient method, or approximate numerical methods.
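以文中提到的梯度下降法为例，最小化到各目标射线的距离之和可用如下Python代码粗略示意（仅为示意性草图，非生产级求解器，亦非本公开实施例的一部分）：Taking the gradient descent method mentioned in the text as an example, minimizing the sum of distances to the target rays can be roughly sketched in Python as follows (an illustrative sketch only, not a production-grade solver, and not part of the disclosed embodiments):

```python
import numpy as np

def solve_min_distance_point(origins, directions, steps=2000, lr=0.01):
    """Minimize the sum of point-to-line distances sum_k d(P, L_k) by
    plain gradient descent, one of the optimization methods named above."""
    origins = [np.asarray(o, dtype=float) for o in origins]
    dirs = [np.asarray(d, dtype=float) for d in directions]
    dirs = [d / np.linalg.norm(d) for d in dirs]
    p = np.mean(origins, axis=0)          # crude initial guess
    for _ in range(steps):
        grad = np.zeros(3)
        for o, d in zip(origins, dirs):
            v = p - o
            r = v - (v @ d) * d           # shortest vector from the line to p
            n = np.linalg.norm(r)
            if n > 1e-12:
                grad += r / n             # gradient of ||r|| with respect to p
        p = p - lr * grad
    return p

# Two rays intersecting at (1, 1, 0): gradient descent approaches that point.
p = solve_min_distance_point(
    [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
```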
在一些实施例中,所述方法还包括:步骤S105和步骤S106,如图5所示。In some embodiments, the method further includes: step S105 and step S106, as shown in FIG5 .
步骤S105:根据所述检测图像确定所述目标在3D空间的第二位置。Step S105: determining a second position of the target in the 3D space according to the detection image.
本公开实施例根据所述检测图像按照目前的相关技术(包括基于纯视觉的3D目标检测算法)可以确定目标在3D空间的第二位置(即不是按照目标射线确定目标在3D空间的位置)。The disclosed embodiment can determine the second position of the target in the 3D space (ie, not the position of the target in the 3D space according to the target ray) based on the detection image according to current related technologies (including a 3D target detection algorithm based on pure vision).
步骤S106:根据所述第一位置和所述第二位置,确定所述目标在3D空间的最终位置。Step S106: Determine the final position of the target in the 3D space according to the first position and the second position.
将按照目标射线确定的目标在3D空间的第一位置和按照相关技术确定的目标在3D空间的第二位置结合起来,确定所述目标在3D空间的最终位置。如此能够进一步获得准确率更高的目标的位置。The first position of the target in the 3D space determined according to the target ray and the second position of the target in the 3D space determined according to the relevant technology are combined to determine the final position of the target in the 3D space, so that the position of the target with higher accuracy can be further obtained.
在一些实施例中,所述方法还包括:步骤S107和步骤S108,如图6所示。In some embodiments, the method further includes: step S107 and step S108, as shown in FIG6 .
步骤S107:获取所述第一位置对应的第一置信度或者第一方差和所述第二位置对应的第二置信度或者第二方差。Step S107: Obtain a first confidence level or a first variance corresponding to the first position and a second confidence level or a second variance corresponding to the second position.
步骤S108:根据所述第一置信度或者第一方差、所述第二置信度或者第二方差,确定所述第一位置对应的第一权重和所述第二位置对应的第二权重,所述第一权重和所述第二权重之和等于1。Step S108: Determine a first weight corresponding to the first position and a second weight corresponding to the second position according to the first confidence or first variance and the second confidence or second variance, wherein the sum of the first weight and the second weight is equal to 1.
此时,步骤S106,所述根据所述第一位置和所述第二位置,确定所述目标在3D空间的最终位置,还可以包括:根据所述第一位置、所述第二位置、所述第一权重以及所述第二权重,确定所述目标在3D空间的最终位置。At this time, step S106, determining the final position of the target in the 3D space according to the first position and the second position, may also include: determining the final position of the target in the 3D space according to the first position, the second position, the first weight and the second weight.
置信度也称为可靠度、置信水平、或置信系数，是一个概率值，即在抽样对总体参数作出估计时，由于样本的随机性，其结论总是不确定的。因此，采用一种概率的陈述方法，也就是数理统计中的区间估计法，即估计值与总体参数在一定允许的误差范围以内，其相应的概率有多大，这个相应的概率称作置信度（或者总体参数值落在样本统计值某一区内的概率）。置信度是描述不确定性的重要指标之一，置信度表示区间估计的把握程度。第一置信度可以是第一位置对应的可靠度，第二置信度可以是第二位置对应的可靠度。Confidence, also called reliability, confidence level, or confidence coefficient, is a probability value: when a population parameter is estimated from a sample, the conclusion is always uncertain because of sampling randomness. A probabilistic statement is therefore used, namely interval estimation in mathematical statistics: the probability that the estimate lies within an allowed error range of the population parameter is called the confidence (equivalently, the probability that the population parameter falls within a given interval around the sample statistic). Confidence is an important measure of uncertainty and expresses the degree of certainty of an interval estimate. The first confidence may be the reliability corresponding to the first position, and the second confidence may be the reliability corresponding to the second position.
概率论中方差用来度量随机变量和其数学期望(即均值)之间的偏离程度。第一方差可以是第一位置对应的偏离程度,第二方差可以是第二位置对应的偏离程度。如果摄像装置的数量是多个,可以采用置信度,如果摄像装置的数量是一个,可以采用方差。In probability theory, variance is used to measure the degree of deviation between a random variable and its mathematical expectation (i.e., mean). The first variance can be the degree of deviation corresponding to the first position, and the second variance can be the degree of deviation corresponding to the second position. If there are multiple cameras, confidence can be used, and if there is only one camera, variance can be used.
第一权重可以是指目标在3D空间的第一位置占确定目标在3D空间的最终位置的比例，第二权重可以是指目标在3D空间的第二位置占确定目标在3D空间的最终位置的比例。第一权重和所述第二权重之和等于1。根据第一位置、所述第二位置、所述第一权重以及所述第二权重即可确定所述目标在3D空间的最终位置。The first weight may refer to the proportion that the first position of the target in 3D space contributes to determining the final position of the target in 3D space, and the second weight may refer to the proportion contributed by the second position. The sum of the first weight and the second weight equals 1. The final position of the target in 3D space can then be determined from the first position, the second position, the first weight, and the second weight.
例如:假设第一位置在预定空间范围内的第一置信度为90%,所述第二位置在预定空间范围内的第二置信度为80%,据此确定第一权重为0.55,第二权重为0.45,目标在3D空间的最终位置为第一位置*0.55+第二位置*0.45。For example: assuming that the first confidence level of the first position within the predetermined spatial range is 90%, and the second confidence level of the second position within the predetermined spatial range is 80%, the first weight is determined to be 0.55, the second weight is 0.45, and the final position of the target in the 3D space is first position * 0.55 + second position * 0.45.
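The weighted combination in steps S107-S108 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name is hypothetical, and plain confidence normalization is assumed (the patent obtains its weights from a trained network or Kalman filtering, which is why the text's 90%/80% confidences map to 0.55/0.45 rather than to the 0.9/1.7 ≈ 0.53 that normalization gives).

```python
import numpy as np

def fuse_positions(p1, p2, conf1, conf2):
    """Fuse the first and second 3D positions with confidence-normalized
    weights; by construction the two weights sum to 1 (step S108)."""
    w1 = conf1 / (conf1 + conf2)
    w2 = conf2 / (conf1 + conf2)
    return w1 * np.asarray(p1, dtype=float) + w2 * np.asarray(p2, dtype=float)

# Fused position lies between the two estimates, pulled toward the
# higher-confidence one.
final = fuse_positions([1.0, 2.0, 3.0], [1.2, 2.2, 3.2], 0.9, 0.8)
```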
下面详细说明本公开实施例的方法的应用。The application of the method of the embodiment of the present disclosure is described in detail below.
本公开实施例可以包括3个模块,如图7所示:The embodiment of the present disclosure may include three modules, as shown in FIG7 :
模块A1-An:为独立的视觉检测模块,每个视觉检测模块单独输出其对目标的2D检测结果(包括目标的尺寸、类别、朝向、中心点位置等信息)和对应的置信度,以及其对目标的3D粗检测结果(利用的方法为目前的相关技术,除了包括目标的尺寸等信息外,还包括目标在3D空间的位置,或者深度粗估计等信息)与对应的置信度等。Modules A1-An are independent visual detection modules. Each visual detection module independently outputs its 2D detection results of the target (including information such as the target's size, category, orientation, center point position, etc.) and the corresponding confidence level, as well as its 3D rough detection results of the target (the method used is the current relevant technology, which includes not only information such as the target's size, but also the target's position in 3D space, or rough depth estimation, etc.) and the corresponding confidence level.
模块B:为目标匹配模块，将模块A1-An输出的目标数据进行匹配，即找出每个视觉检测模块对应同一目标的数据。Module B: the target matching module, which matches the target data output by modules A1-An, i.e., finds the data from each visual detection module that corresponds to the same target.
模块C:为目标射线估计模块。Module C: target ray estimation module.
情况一：该模块将同一目标的不同摄像装置视角下的目标的成像点的位置和摄像装置的光心的位置拟合为目标射线，并计算空间中到这些目标射线的距离之和最小的点或者这些目标射线的交点，从而得出精确的3D位置估计。Case 1: For the same target, this module fits a target ray through the position of the target's imaging point in each camera view and the position of that camera's optical center, then computes either the point in space whose sum of distances to these target rays is minimal or the intersection of the rays, thereby obtaining an accurate 3D position estimate.
情况二：该模块将一个摄像装置视角下的目标的成像点的位置和摄像装置的光心的位置拟合为目标射线，并求解该目标射线与地面方程的交点，从而得出精确的3D位置估计。Case 2: This module fits a target ray through the position of the target's imaging point in a single camera view and the position of the camera's optical center, then solves for the intersection of the target ray with the ground equation, thereby obtaining an accurate 3D position estimate.
模块D:为加权修正模块，由于目标的2D检测结果与本公开实施例通过目标射线得到的目标的3D检测结果的准确性基本相同，因此将目标的2D检测结果的置信度作为本公开实施例通过目标射线得到的目标的3D检测结果的置信度；该模块以2D检测结果与3D粗检测结果的置信度为依据，首先利用多个已知的2D检测结果与3D粗检测结果的置信度作为训练数据训练出神经网络，然后以2D检测结果与3D粗检测结果的置信度为依据通过已训练的神经网络得到第一权重和第二权重，将模块C的输出和模块A直接得到的3D粗检测结果进行加权求和，得到估计精度更高的目标的最终位置。除了采用神经网络得到权值外，还可通过卡尔曼滤波等方法得到第一权重和第二权重。Module D: the weighted correction module. Since the accuracy of the target's 2D detection result is essentially the same as that of the ray-based 3D detection result of this embodiment, the confidence of the 2D detection result is used as the confidence of the ray-based 3D detection result. Based on the confidences of the 2D detection results and the rough 3D detection results, a neural network is first trained using many known confidence pairs as training data; the trained network then maps these confidences to the first weight and the second weight. The output of module C and the rough 3D detection result obtained directly from module A are combined as a weighted sum, giving a final target position with higher estimation accuracy. Besides a neural network, the first and second weights can also be obtained by methods such as Kalman filtering.
实施例一、一种基于3D粗检测结果与多目相机目标射线的3D目标检测系统的结合方案,其结构如图8所示。Embodiment 1: A solution for combining a 3D target detection system based on 3D rough detection results and multi-camera target rays, the structure of which is shown in FIG8 .
第一步、模块A1-An,视觉检测模块,该模块以检测图像为输入,输出检测图像中目标的2D检测结果和对应的置信度,以及其对目标的3D粗检测结果(即第二位置)与对应的置信度等。The first step, module A1-An, visual detection module, takes the detection image as input, and outputs the 2D detection result and corresponding confidence of the target in the detection image, as well as its 3D rough detection result (i.e., the second position) and corresponding confidence of the target.
第二步、模块B，目标匹配模块，利用目标3D粗检测结果，将模块A1-An中检测到的同一目标的数据匹配起来。具体方法可以是：当不同摄像装置在世界坐标系下检测到的目标存在重合，或两个目标的距离小于一定阈值Y时，则将这些目标对应的数据匹配起来。The second step, module B, the target matching module, uses the rough 3D detection results to match the data of the same target detected by modules A1-An. A specific method can be: when the targets detected by different cameras overlap in the world coordinate system, or when the distance between two targets is less than a certain threshold Y, the data corresponding to these targets are matched.
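The matching rule of module B can be sketched as a greedy nearest-neighbor association under the threshold Y. The greedy strategy, the helper name, and the example threshold value are assumptions for illustration; the patent only specifies the overlap/distance criterion.

```python
import numpy as np

def match_targets(dets_a, dets_b, threshold_y=1.0):
    """Greedily match rough 3D detections from two cameras.

    dets_a, dets_b: lists of (x, y, z) rough positions in the world frame.
    threshold_y is the distance threshold Y from the text (value assumed).
    Returns index pairs (i, j) whose positions are closer than the threshold.
    """
    pairs = []
    used = set()
    for i, pa in enumerate(dets_a):
        best_j, best_d = None, threshold_y
        for j, pb in enumerate(dets_b):
            if j in used:
                continue
            d = np.linalg.norm(np.asarray(pa, float) - np.asarray(pb, float))
            if d < best_d:          # keep the nearest detection under Y
                best_j, best_d = j, d
        if best_j is not None:
            pairs.append((i, best_j))
            used.add(best_j)        # each detection matches at most once
    return pairs
```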
第三步、模块C,目标射线位置估计模块,利用多个摄像装置视角下的目标的成像点的位置和摄像装置的光心的位置构建多个目标射线,计算得到目标在3D空间的第一位置。原理如图9所示:具体为构建从目标的成像点的位置和摄像装置的光心的位置的目标射线,并得到该目标射线的在世界坐标系下的方程;根据不同相机的结果,得到至少两条目标射线后,计算这些目标射线的交点或与这些目标射线的距离之和最小的最小距离点,将交点或者最小距离点作为目标在3D空间的第一位置。The third step, module C, the target ray position estimation module, uses the position of the imaging point of the target under the perspective of multiple cameras and the position of the optical center of the camera to construct multiple target rays, and calculate the first position of the target in the 3D space. The principle is shown in Figure 9: Specifically, the target ray from the position of the imaging point of the target and the position of the optical center of the camera is constructed, and the equation of the target ray in the world coordinate system is obtained; after obtaining at least two target rays according to the results of different cameras, the intersection of these target rays or the minimum distance point with the minimum sum of distances to these target rays is calculated, and the intersection or the minimum distance point is used as the first position of the target in the 3D space.
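The ray construction and intersection step of module C can be sketched as below. The camera-to-world extrinsic convention, the helper names, and the use of a least-squares point (minimizing the sum of *squared* distances, a common tractable stand-in for the sum of distances mentioned in the text) are assumptions, not the patent's exact method.

```python
import numpy as np

def pixel_to_ray(uv, K, R, t):
    """Back-project a pixel to a world-frame ray (origin, unit direction).

    K is the 3x3 intrinsic matrix; R, t are assumed to be the
    camera-to-world rotation and the optical-center position in the
    world frame. The ray passes through the optical center and the
    target's imaging point, as in module C.
    """
    d_cam = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])  # camera-frame direction
    d_world = R @ d_cam                                       # rotate into world frame
    origin = np.asarray(t, dtype=float)                       # optical center
    return origin, d_world / np.linalg.norm(d_world)

def closest_point_to_rays(origins, dirs):
    """Least-squares point minimizing the sum of squared distances to rays.

    Solves sum_i (I - d_i d_i^T) p = sum_i (I - d_i d_i^T) o_i for unit
    directions d_i; when the rays intersect exactly, the solution is
    their intersection point.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, dirs):
        P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray direction
        A += P
        b += P @ np.asarray(o, dtype=float)
    return np.linalg.solve(A, b)
```

With two cameras (as in Embodiment 1), two rays suffice; with more cameras the same linear system simply accumulates all rays.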
最终目标Oi的坐标为:The coordinates of the final target Oi are:
本公开实施例由于只有两个相机，因此采用解析几何的数值解法求解坐标Pi。Since there are only two cameras in this embodiment of the present disclosure, the coordinates Pi are solved using the numerical method of analytic geometry.
第四步、模块D将模块A1-An得到的2D检测结果的置信度与3D粗检测结果的置信度输入已训练好的神经网络,得到第一权重和第二权重。将模块C得到的目标在3D空间的第一位置与模块A1-An得到的3D粗检测结果(即第二位置)加权求和得到最终结果,即目标在3D空间的最终位置。In the fourth step, module D inputs the confidence of the 2D detection result obtained by module A1-An and the confidence of the 3D rough detection result into the trained neural network to obtain the first weight and the second weight. The first position of the target in the 3D space obtained by module C and the 3D rough detection result (i.e., the second position) obtained by module A1-An are weighted and summed to obtain the final result, i.e., the final position of the target in the 3D space.
实施例二、一种基于深度粗估计与多目相机目标射线的3D目标检测系统的结合方案,其结构如图10所示。Embodiment 2: A combination solution of a 3D target detection system based on coarse depth estimation and multi-camera target rays, the structure of which is shown in FIG10 .
第一步、模块A1-An,视觉检测模块,该模块以检测图像为输入,输出检测图像中目标的2D检测结果和对应的置信度,以及深度粗估计。The first step, module A1-An, the visual detection module, takes the detection image as input and outputs the 2D detection result and corresponding confidence of the target in the detection image, as well as the rough depth estimation.
第二步、模块B,目标匹配模块,利用深度粗估计,反推出3D粗检测结果(即第二位置),将模块A1-An中检测到的同一目标的数据匹配起来。具体方法可以是:当不同摄像装置在世界坐标系下检测到的目标存在重合,或两个目标的距离小于一定阈值Y时,则将这些目标对应的数据匹配起来。In the second step, module B, the target matching module, uses the coarse depth estimation to infer the 3D coarse detection result (i.e., the second position) and matches the data of the same target detected in modules A1-An. The specific method can be: when the targets detected by different camera devices in the world coordinate system overlap, or the distance between two targets is less than a certain threshold Y, the data corresponding to these targets are matched.
第三步、模块C,目标射线位置估计模块,利用多个摄像装置视角下的目标的成像点的位置和摄像装置的光心的位置构建多个目标射线,计算得到目标在3D空间的第一位置。原理如图9所示:具体为构建从目标的成像点的位置和摄像装置的光心的位置的目标射线,并得到该目标射线的在世界坐标系下的方程;根据不同相机的结果,得到至少两条目标射线后,计算这些目标射线的交点或与这些目标射线的距离之和最小的最小距离点,将交点或者最小距离点作为目标在3D空间的第一位置。The third step, module C, the target ray position estimation module, uses the position of the imaging point of the target under the perspective of multiple cameras and the position of the optical center of the camera to construct multiple target rays, and calculate the first position of the target in the 3D space. The principle is shown in Figure 9: Specifically, the target ray from the position of the imaging point of the target and the position of the optical center of the camera is constructed, and the equation of the target ray in the world coordinate system is obtained; after obtaining at least two target rays according to the results of different cameras, the intersection of these target rays or the minimum distance point with the minimum sum of distances to these target rays is calculated, and the intersection or the minimum distance point is used as the first position of the target in the 3D space.
最终目标Oi的坐标为:The coordinates of the final target Oi are:
本公开实施例由于摄像装置数量多，求解困难，因此可以先将摄像装置两两分组，每组利用解析几何的数值解法求解坐标Pi,j，最后再将Pi,j加权求和得到Pi。In this embodiment of the present disclosure, since there are many cameras and a joint solution is difficult, the cameras can first be grouped in pairs; each pair solves the coordinates Pi,j using the numerical method of analytic geometry, and finally Pi is obtained as a weighted sum of the Pi,j.
第四步、模块D将模块A1-An得到的2D检测结果的置信度与模块B利用深度粗估计反推得到的3D粗检测结果的置信度输入已训练好的神经网络,得到第一权重和第二权重。将模块C得到的目标在3D空间的第一位置与模块B利用深度粗估计反推得到的3D粗检测结果加权求和得到最终结果,即目标在3D空间的最终位置。In the fourth step, module D inputs the confidence of the 2D detection results obtained by modules A1-An and the confidence of the 3D rough detection results obtained by module B using rough depth estimation into the trained neural network to obtain the first weight and the second weight. The first position of the target in the 3D space obtained by module C and the 3D rough detection result obtained by module B using rough depth estimation are weighted and summed to obtain the final result, that is, the final position of the target in the 3D space.
实施例三、一种基于地面方程与单目相机目标射线的3D目标检测系统的结合方案，其结构如图11所示。Embodiment 3: A combination solution of a 3D target detection system based on the ground equation and monocular camera target rays, the structure of which is shown in FIG11.
第一步、模块A，视觉检测模块，该模块以检测图像为输入，输出检测图像中目标的2D检测结果和对应的方差，以及3D粗检测结果和对应的方差。The first step, module A, the visual detection module, takes the detection image as input and outputs the 2D detection result of the target in the detection image with its corresponding variance, as well as the rough 3D detection result with its corresponding variance.
第二步、模块C,射线位置估计模块,利用一个摄像装置视角下的目标的成像点的位置和摄像装置的光心的位置构建一个目标射线,与地面方程联和,计算得到目标在3D空间的第一位置。原理如图12所示:具体为构建从目标的成像点的位置和摄像装置的光心的位置的目标射线,并得到该目标射线在世界坐标系下的方程;再将该目标射线在世界坐标系下的方程与地面方程联立,求解得到目标在3D空间的第一位置。The second step, module C, the ray position estimation module, uses the position of the imaging point of the target under the viewing angle of a camera device and the position of the optical center of the camera device to construct a target ray, and combines it with the ground equation to calculate the first position of the target in the 3D space. The principle is shown in Figure 12: Specifically, the target ray is constructed from the position of the imaging point of the target and the position of the optical center of the camera device, and the equation of the target ray in the world coordinate system is obtained; then the equation of the target ray in the world coordinate system is combined with the ground equation to solve the first position of the target in the 3D space.
目标射线在世界坐标系下的方程:The equation of the target ray in the world coordinate system is:
地面方程:Ground equation:
Ux+Vy+Wz+D=0
将目标射线在世界坐标系下的方程与地面方程联立可求解出目标Oi在3D空间的第一位置。The first position of the target O i in the 3D space can be solved by combining the equation of the target ray in the world coordinate system with the ground equation.
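Combining the target-ray equation with the ground equation Ux+Vy+Wz+D=0 is a standard ray-plane intersection. A sketch (the function name and the rejection of behind-camera solutions are assumptions):

```python
import numpy as np

def ray_ground_intersection(origin, direction, plane):
    """Intersect the target ray O + s*d with the ground plane
    U*x + V*y + W*z + D = 0, given as plane = (U, V, W, D).

    Returns the 3D intersection point, or None when the ray is parallel
    to the ground or the intersection lies behind the optical center.
    """
    n = np.asarray(plane[:3], dtype=float)   # plane normal (U, V, W)
    d = np.asarray(direction, dtype=float)
    o = np.asarray(origin, dtype=float)
    denom = n @ d
    if abs(denom) < 1e-12:
        return None                          # ray parallel to the ground
    s = -(n @ o + plane[3]) / denom
    if s < 0:
        return None                          # intersection behind the camera
    return o + s * d
```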
第四步、模块D,加权修正模块,利用A模块得到的2D检测结果的方差与3D粗检测结果的方差,通过卡尔曼滤波方法得到第一权重和第二权重,将模块C得到的目标在3D空间的第一位置与模块A直接输出的3D粗检测结果进行加权求和得到最终结果,即目标在3D空间的最终位置。The fourth step, module D, the weighted correction module, uses the variance of the 2D detection result obtained by module A and the variance of the 3D rough detection result to obtain the first weight and the second weight through the Kalman filtering method, and performs weighted summation on the first position of the target in the 3D space obtained by module C and the 3D rough detection result directly output by module A to obtain the final result, that is, the final position of the target in the 3D space.
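For the variance-based weighting of module D, a minimal sketch is inverse-variance weighting, the steady-state form of the Kalman update: each weight is proportional to the other estimate's variance and the two weights sum to 1. The patent may use a full Kalman filter, so this closed form is an illustrative assumption.

```python
import numpy as np

def inverse_variance_fuse(p1, var1, p2, var2):
    """Inverse-variance fusion of two 3D position estimates: the
    lower-variance estimate dominates the fused position, and the
    weights sum to 1 as required."""
    w1 = var2 / (var1 + var2)
    w2 = var1 / (var1 + var2)
    fused = w1 * np.asarray(p1, dtype=float) + w2 * np.asarray(p2, dtype=float)
    return fused, (w1, w2)
```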
参见图13，图13是本公开目标检测装置一实施例的结构示意图，需要说明的是，本公开实施例的目标检测装置能够实现上述目标检测方法，相关内容的详细说明，请参见上述方法部分，在此不再赘述。Refer to FIG13, which is a structural diagram of an embodiment of the target detection device of the present disclosure. It should be noted that the target detection device of this embodiment can implement the above target detection method; for a detailed description of the related content, please refer to the method section above, which will not be repeated here.
所述装置100包括存储器1以及处理器2,所述存储器1设置为存储计算机程序;所述处理器2设置为执行所述计算机程序并在执行所述计算机程序时实现如上任一所述的目标检测方法。存储器1通过总线与处理器2连接。The device 100 includes a memory 1 and a processor 2, wherein the memory 1 is configured to store a computer program; the processor 2 is configured to execute the computer program and implement any of the above target detection methods when executing the computer program. The memory 1 is connected to the processor 2 via a bus.
其中,处理器2可以是微控制单元、中央处理单元或数字信号处理器,等等。存储器1可以是Flash芯片、只读存储器、磁盘、光盘、U盘或者移动硬盘等等。The processor 2 may be a microcontroller unit, a central processing unit or a digital signal processor, etc. The memory 1 may be a Flash chip, a read-only memory, a disk, an optical disk, a USB flash drive or a mobile hard disk, etc.
本公开实施例还提供一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时使所述处理器实现如上任一所述的目标检测方法。An embodiment of the present disclosure further provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the processor implements any of the target detection methods described above.
其中,该计算机可读存储介质可以是上述目标检测装置的内部存储单元,例如硬盘或内存。该计算机可读存储介质也可以是上述目标检测装置的外部存储设备,例如配备的插接式硬盘、智能存储卡、安全数字卡、闪存卡,等等。The computer-readable storage medium may be an internal storage unit of the target detection device, such as a hard disk or a memory. The computer-readable storage medium may also be an external storage device of the target detection device, such as a plug-in hard disk, a smart memory card, a secure digital card, a flash memory card, etc.
本领域普通技术人员可以理解,上文中所公开方法中的全部或某些步骤、系统、设备中的功能模块/单元可以被实施为软件、固件、硬件及其适当的组合。Those skilled in the art will appreciate that all or some of the steps in the methods disclosed above, and the functional modules/units in the systems and devices may be implemented as software, firmware, hardware, or a suitable combination thereof.
在硬件实施方式中，在以上描述中提及的功能模块/单元之间的划分不一定对应于物理组件的划分；例如，一个物理组件可以具有多个功能，或者一个功能或步骤可以由若干物理组件合作执行。某些物理组件或所有物理组件可以被实施为由处理器，如中央处理器、数字信号处理器或微处理器执行的软件，或者被实施为硬件，或者被实施为集成电路，如专用集成电路。这样的软件可以分布在计算机可读介质上，计算机可读介质可以包括计算机存储介质(或非暂时性介质)和通信介质(或暂时性介质)。如本领域普通技术人员公知的，术语计算机存储介质包括在设置为存储信息(诸如计算机可读指令、数据结构、程序模块或其他数据)的任何方法或技术中实施的易失性和非易失性、可移除和不可移除介质。计算机存储介质包括但不限于RAM、ROM、EEPROM、闪存或其他存储器技术、CD-ROM、数字多功能盘(DVD)或其他光盘存储、磁盒、磁带、磁盘存储或其他磁存储装置、或者可以设置为存储期望的信息并且可以被计算机访问的任何其他的介质。此外，本领域普通技术人员公知的是，通信介质通常包含计算机可读指令、数据结构、程序模块或者诸如载波或其他传输机制之类的调制数据信号中的其他数据，并且可包括任何信息递送介质。In hardware implementations, the division between the functional modules/units mentioned above does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all physical components may be implemented as software executed by a processor such as a central processing unit, a digital signal processor, or a microprocessor, or implemented as hardware, or implemented as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
In addition, it is known to those of ordinary skill in the art that communication media typically contain computer-readable instructions, data structures, program modules or other data in modulated data signals such as carrier waves or other transmission mechanisms, and may include any information delivery medium.
以上参照附图说明了本公开的优选实施例,并非因此局限本公开的权利范围。本领域技术人员不脱离本公开的范围和实质内所作的任何修改、等同替换和改进,均应在本公开的权利范围之内。
The preferred embodiments of the present disclosure are described above with reference to the accompanying drawings, but the scope of the present disclosure is not limited thereby. Any modification, equivalent substitution and improvement made by those skilled in the art without departing from the scope and essence of the present disclosure shall be within the scope of the present disclosure.
Claims (12)
- 一种目标检测方法,所述方法包括:A target detection method, the method comprising:获取摄像装置的检测图像,所述检测图像包括目标对应的图像;Acquire a detection image of the camera device, wherein the detection image includes an image corresponding to the target;根据所述检测图像确定所述目标的成像点在所述摄像装置的像平面上的位置;Determining the position of the imaging point of the target on the image plane of the camera device according to the detection image;根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的光心的位置,确定所述目标对应的目标射线,所述目标射线通过所述目标的成像点的位置和所述摄像装置的光心的位置;Determine a target ray corresponding to the target according to the position of the imaging point of the target on the image plane of the imaging device and the position of the optical center of the imaging device, wherein the target ray passes through the position of the imaging point of the target and the position of the optical center of the imaging device;根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置。A first position of the target in the 3D space is determined according to the target ray corresponding to the target.
- 根据权利要求1所述的方法,其中,所述摄像装置的数量为多个;The method according to claim 1, wherein the number of the camera devices is multiple;所述根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置,包括:The determining, according to the target ray corresponding to the target, a first position of the target in the 3D space includes:根据所述目标对应的多个目标射线,确定所述目标在3D空间的第一位置。A first position of the target in the 3D space is determined according to a plurality of target rays corresponding to the target.
- 根据权利要求2所述的方法,其中,所述根据所述目标对应的多个目标射线,确定所述目标在3D空间的第一位置,包括:The method according to claim 2, wherein determining the first position of the target in the 3D space according to the multiple target rays corresponding to the target comprises:求解所述3D空间中到各个所述目标射线的距离之和最小的最小距离点;Finding the minimum distance point in the 3D space where the sum of distances to each of the target rays is the smallest;将所述最小距离点的位置作为所述目标在3D空间的第一位置。The position of the minimum distance point is used as the first position of the target in the 3D space.
- 根据权利要求3所述的方法,其中,所述将所述最小距离点的位置作为所述目标在3D空间的第一位置,包括:The method according to claim 3, wherein taking the position of the minimum distance point as the first position of the target in the 3D space comprises:将多个所述目标射线的交点的位置作为所述目标在3D空间的第一位置。The position of the intersection of the plurality of target rays is used as the first position of the target in the 3D space.
- 根据权利要求1所述的方法,其中,所述摄像装置的数量为一个;The method according to claim 1, wherein the number of the camera device is one;所述根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置,包括:The determining, according to the target ray corresponding to the target, a first position of the target in the 3D space includes:根据所述目标对应的一个目标射线和所述目标所在的平面,确定所述目标在3D空间的第一位置。A first position of the target in the 3D space is determined according to a target ray corresponding to the target and a plane where the target is located.
- 根据权利要求5所述的方法,其中,所述根据所述目标对应的一个目标射线和所述目标所在的平面,确定所述目标在3D空间的第一位置,包括:The method according to claim 5, wherein determining the first position of the target in the 3D space according to a target ray corresponding to the target and the plane where the target is located comprises:求解所述目标射线与所述目标所在的平面的交点;Solving the intersection point between the target ray and the plane where the target is located;将所述目标射线与所述目标所在的平面的交点的位置作为所述目标在3D空间的第一位置。The position of the intersection of the target ray and the plane where the target is located is used as the first position of the target in the 3D space.
- 根据权利要求1-6任一项所述的方法,其中,所述根据所述目标对应的目标射线,确定所述目标在3D空间的第一位置,包括:The method according to any one of claims 1 to 6, wherein determining the first position of the target in the 3D space according to the target ray corresponding to the target comprises:根据所述目标对应的目标射线,确定所述目标在世界坐标系下的第一位置。A first position of the target in a world coordinate system is determined according to a target ray corresponding to the target.
- 根据权利要求7所述的方法,其中,所述根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的光心的位置,确定所述目标对应的目标射线,包括:The method according to claim 7, wherein the step of determining the target ray corresponding to the target based on the position of the imaging point of the target on the image plane of the imaging device and the position of the optical center of the imaging device comprises:根据所述目标的成像点在所述摄像装置的像平面上的位置和所述摄像装置的内参公式,确定所述目标的成像点在相机坐标系下的第一坐标;Determine a first coordinate of the imaging point of the target in a camera coordinate system according to a position of the imaging point of the target on an image plane of the imaging device and an internal parameter formula of the imaging device;根据所述目标的成像点在所述相机坐标系下的第一坐标、所述摄像装置的光心在所述相机坐标系下的第二坐标以及所述摄像装置的外参矩阵,确定所述目标的成像点在所述世界坐标系下的第三坐标以及所述摄像装置的光心在所述世界坐标系下的第四坐标;Determine, according to the first coordinate of the imaging point of the target in the camera coordinate system, the second coordinate of the optical center of the imaging device in the camera coordinate system, and the extrinsic parameter matrix of the imaging device, the third coordinate of the imaging point of the target in the world coordinate system and the fourth coordinate of the optical center of the imaging device in the world coordinate system;根据所述目标的成像点在所述世界坐标系下的第三坐标以及所述摄像装置的光心在所述 世界坐标系下的第四坐标,确定所述目标在所述世界坐标系下对应的目标射线。According to the third coordinate of the imaging point of the target in the world coordinate system and the optical center of the camera device in the The fourth coordinate in the world coordinate system determines the target ray corresponding to the target in the world coordinate system.
- 根据权利要求1-6任一项所述的方法,其中,所述方法还包括:The method according to any one of claims 1 to 6, wherein the method further comprises:根据所述检测图像确定所述目标在3D空间的第二位置;Determine a second position of the target in the 3D space according to the detection image;根据所述第一位置和所述第二位置,确定所述目标在3D空间的最终位置。A final position of the target in the 3D space is determined according to the first position and the second position.
- 根据权利要求9所述的方法,其中,所述方法还包括:The method according to claim 9, wherein the method further comprises:获取所述第一位置对应的第一置信度或者第一方差和所述第二位置对应的第二置信度或者第二方差;Obtaining a first confidence level or a first variance corresponding to the first position and a second confidence level or a second variance corresponding to the second position;根据所述第一置信度或者第一方差、所述第二置信度或者第二方差,确定所述第一位置对应的第一权重和所述第二位置对应的第二权重,所述第一权重和所述第二权重之和等于1;Determine, according to the first confidence or the first variance and the second confidence or the second variance, a first weight corresponding to the first position and a second weight corresponding to the second position, wherein the sum of the first weight and the second weight is equal to 1;所述根据所述第一位置和所述第二位置,确定所述目标在3D空间的最终位置,包括:Determining a final position of the target in the 3D space according to the first position and the second position includes:根据所述第一位置、所述第二位置、所述第一权重以及所述第二权重,确定所述目标在3D空间的最终位置。A final position of the target in the 3D space is determined according to the first position, the second position, the first weight, and the second weight.
- 一种目标检测装置,所述装置包括存储器以及处理器,所述存储器设置为存储计算机程序;所述处理器设置为执行所述计算机程序并在执行所述计算机程序时实现如权利要求1-10任一项所述的目标检测方法。A target detection device, comprising a memory and a processor, wherein the memory is configured to store a computer program; the processor is configured to execute the computer program and implement the target detection method according to any one of claims 1 to 10 when executing the computer program.
- 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时使所述处理器实现如权利要求1-10任一项所述的目标检测方法。 A computer-readable storage medium stores a computer program, wherein when the computer program is executed by a processor, the processor implements the target detection method according to any one of claims 1 to 10.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310002393.9A CN118334113A (en) | 2023-01-03 | 2023-01-03 | Target detection method, device and storage medium |
CN202310002393.9 | 2023-01-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024146365A1 true WO2024146365A1 (en) | 2024-07-11 |
Family
ID=91776438
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/139598 WO2024146365A1 (en) | 2023-01-03 | 2023-12-18 | Object detection method and apparatus and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118334113A (en) |
WO (1) | WO2024146365A1 (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102622747A (en) * | 2012-02-16 | 2012-08-01 | 北京航空航天大学 | Camera parameter optimization method for vision measurement |
US20200357134A1 (en) * | 2019-05-09 | 2020-11-12 | Trimble Inc. | Target positioning with bundle adjustment |
CN112017238A (en) * | 2019-05-30 | 2020-12-01 | 北京初速度科技有限公司 | Method and device for determining spatial position information of linear object |
CN112771854A (en) * | 2020-04-14 | 2021-05-07 | 深圳市大疆创新科技有限公司 | Projection display method, system, terminal and storage medium based on multiple camera devices |
CN112967342A (en) * | 2021-03-18 | 2021-06-15 | 深圳大学 | High-precision three-dimensional reconstruction method and system, computer equipment and storage medium |
CN115439531A (en) * | 2022-06-21 | 2022-12-06 | 亮风台(上海)信息科技有限公司 | Method and equipment for acquiring target space position information of target object |
- 2023-01-03 CN CN202310002393.9A patent/CN118334113A/en active Pending
- 2023-12-18 WO PCT/CN2023/139598 patent/WO2024146365A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
CN118334113A (en) | 2024-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111354042B (en) | Feature extraction method and device of robot visual image, robot and medium | |
US11668571B2 (en) | Simultaneous localization and mapping (SLAM) using dual event cameras | |
US11830216B2 (en) | Information processing apparatus, information processing method, and storage medium | |
US8290305B2 (en) | Registration of 3D point cloud data to 2D electro-optical image data | |
JP2021184307A (en) | System and method for detecting lines with vision system | |
JP5759161B2 (en) | Object recognition device, object recognition method, learning device, learning method, program, and information processing system | |
CN105627932A (en) | Distance measurement method and device based on binocular vision | |
WO2017077925A1 (en) | Method and system for estimating three-dimensional pose of sensor | |
US11212511B1 (en) | Residual error mitigation in multiview calibration | |
CN112288813B (en) | Pose estimation method based on multi-eye vision measurement and laser point cloud map matching | |
US20130028482A1 (en) | Method and System for Thinning a Point Cloud | |
EP3166074A1 (en) | Method of camera calibration for a multi-camera system and apparatus performing the same | |
JP2017090450A (en) | System and method for detecting lines in a vision system | |
WO2022217988A1 (en) | Sensor configuration scheme determination method and apparatus, computer device, storage medium, and program | |
CN109961092B (en) | Binocular vision stereo matching method and system based on parallax anchor point | |
Angladon et al. | The toulouse vanishing points dataset | |
CN114494393B (en) | Space measurement method, device, equipment and storage medium based on monocular camera | |
WO2024146365A1 (en) | Object detection method and apparatus and storage medium | |
Kotov et al. | DEM generation based on RPC model using relative conforming estimate criterion | |
CN112950709A (en) | Pose prediction method, pose prediction device and robot | |
CN113139454B (en) | Road width extraction method and device based on single image | |
US9824455B1 (en) | Detecting foreground regions in video frames | |
US9842402B1 (en) | Detecting foreground regions in panoramic video frames | |
CN112907650A (en) | Cloud height measuring method and equipment based on binocular vision | |
CN113379816A (en) | Structure change detection method, electronic device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23914499 Country of ref document: EP Kind code of ref document: A1 |