CN112711972B - Target detection method and device - Google Patents
- Publication number: CN112711972B
- Application number: CN201911026844.2A
- Authority
- CN
- China
- Prior art keywords
- target
- boundary
- box
- bounding
- bounding boxes
- Prior art date
- Legal status
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The application discloses a target detection method and device. The method includes: obtaining a plurality of bounding boxes on a target image together with their scores, where a score represents the confidence that a bounding box contains a target object; dividing the target image into a plurality of grids and determining the grid to which each bounding box belongs; traversing the bounding boxes, computing the overlap between a reference bounding box and its neighboring bounding boxes, and obtaining target bounding boxes according to the overlap; and determining the target detection result according to the scores of the target bounding boxes. In the embodiments of the application, grid division of the target image determines the grid to which each bounding box belongs, and the adjacency between grids then determines the adjacency between bounding boxes, so that the NMS algorithm only needs to compute the overlap between a reference bounding box and its neighboring bounding boxes. This effectively reduces the time complexity of NMS and improves target detection efficiency.
Description
Technical Field
The present application relates to the field of image processing, and in particular, to a target detection method and apparatus.
Background
With the rapid development of deep learning and computer vision, the related techniques have been widely applied in many fields. Object detection (Object Detection) is an important part of image processing: its task is to find all objects of interest in an image and determine their positions and sizes, and it is one of the core problems in the field of machine vision.
When a deep learning method is used for target detection, two steps are typically involved. First, bounding boxes (bbox) and bounding box scores are generated on the target image, where a higher score indicates a higher probability that the object in the bbox is the target object. Second, high-score bounding boxes whose scores satisfy a preset condition are screened out, and the objects in those boxes are identified as the target objects to be recognized.
In this process, to avoid missing objects, the bbox boxes are generated densely, so multiple boxes appear on the same object. To select the optimal box among the overlapping boxes, a Non-Maximum Suppression (NMS) algorithm is introduced: low-score boxes that overlap a high-score box are filtered out by computing the Intersection over Union (IOU), yielding the optimal box on each object.
However, in the NMS algorithm, each selected high-score box must perform an IOU calculation with every box scoring lower than it. In the worst case, the complexity of the algorithm is O(N²), so as the number of bbox boxes grows, the time consumption grows quadratically. The increased NMS time reduces the detection frame rate and seriously degrades the detection performance of the algorithm.
Disclosure of Invention
The embodiments of the application provide a target detection method and device. By dividing the target image into grids, the grid to which each bounding box belongs can be determined, and the adjacency between bounding boxes can then be determined from the adjacency between grids. As a result, the NMS algorithm only needs to compute the overlap between a reference bounding box and its neighboring bounding boxes, which effectively reduces the time complexity of NMS and improves target detection efficiency.
In a first aspect, an embodiment of the present application provides a target detection method, including: obtaining a plurality of bounding boxes on a target image and scores of the bounding boxes, where a score represents the confidence that a bounding box contains a target object; dividing the target image into a plurality of grids, and determining the grids to which the plurality of bounding boxes belong; traversing the plurality of bounding boxes, computing the overlap between a reference bounding box and its neighboring bounding boxes, and obtaining a target bounding box according to the overlap, where the reference bounding box is any one of the bounding boxes, the neighboring bounding boxes include bounding boxes belonging to a target grid and bounding boxes belonging to grids adjacent to the target grid, the target grid is the grid to which the reference bounding box belongs, and the neighboring bounding boxes do not include the reference bounding box itself; and determining a target detection result according to the score of the target bounding box.
In the embodiment of the application, the target image is divided into a plurality of grids; the neighboring bounding boxes of each bounding box are determined according to the grid division result; the plurality of bounding boxes are traversed to obtain the target bounding boxes that are not suppressed by their neighbors; and the target detection result is finally determined according to the scores of the target bounding boxes. Because the grid of each bounding box is determined first and its neighboring bounding boxes are derived from that grid, only the overlap between a bounding box and its neighbors needs to be computed during the traversal. This greatly reduces the amount of data to be processed, improves data processing efficiency, and thereby improves target detection efficiency.
In an optional example, before traversing the plurality of bounding boxes, the method further includes: sorting the bounding boxes by score to obtain an ordering number for each bounding box.
In the embodiment of the application, the plurality of bounding boxes are sorted by score to obtain their ordering numbers, so that the subsequent traversal proceeds in order. When computing the overlap between the reference bounding box and its neighbors, only the neighbors whose ordering number is larger (or smaller) than that of the reference bounding box need be considered, which reduces the amount of data processed and further improves detection efficiency.
In an optional example, sorting the bounding boxes by score specifically includes: sorting the bounding boxes in descending order of score, so that a bounding box with a smaller score has a larger ordering number.
In an optional example, traversing the plurality of bounding boxes and computing the overlap between a reference bounding box and its neighbors to obtain the target bounding boxes includes: taking the bounding box with ordering number i as the reference bounding box and obtaining its flag bit; if the flag bit of the reference bounding box is the first flag value, obtaining the neighboring bounding boxes of the reference bounding box and checking whether each neighbor's ordering number is greater than i; if a neighbor's ordering number is greater than i, computing the IOU of the reference bounding box and that neighbor; if the IOU is greater than a preset threshold, setting the neighbor's flag bit to the second flag value; and finally taking the bounding boxes whose flag bit is still the first flag value as the target bounding boxes.
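The traversal above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the box format (x1, y1, x2, y2), the boolean flag standing in for the first/second flag values, and the precomputed `neighbors` map are all assumptions for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def grid_nms(boxes, scores, neighbors, t=0.5):
    """Traverse boxes in descending score order; a box's flag (True = first
    flag value, i.e. not suppressed) is cleared when a higher-ranked neighbor
    suppresses it. `neighbors[i]` holds the indices of the boxes in box i's
    own grid and adjacent grids, excluding box i itself."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    rank = {b: r for r, b in enumerate(order)}   # ordering number of each box
    alive = [True] * len(boxes)                  # flag bit per box
    for b in order:                              # reference bounding box
        if not alive[b]:
            continue
        for n in neighbors[b]:
            # only neighbors with a larger ordering number are compared
            if rank[n] > rank[b] and alive[n] and iou(boxes[b], boxes[n]) > t:
                alive[n] = False                 # second flag value
    return [i for i, a in enumerate(alive) if a]
```

Only the boxes listed in `neighbors[b]` are compared against the reference box, which is where the reduction over the classic all-pairs IOU scan comes from.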
In an optional example, the bounding boxes differ in size, and dividing the target image into a plurality of grids includes: dividing the target image into grids according to the maximum of the sizes of the plurality of bounding boxes.
In the embodiment of the application, the target image is divided into grids according to the maximum bounding box size, so that when the neighboring bounding boxes of a reference bounding box are obtained from the grids, no bounding box whose overlap with the reference bounding box exceeds the preset threshold is missed because the grid size is too small. This improves the accuracy of obtaining the target bounding box, and thereby the accuracy of the target detection result.
In an optional example, the bounding boxes are the same size, and dividing the target image into a plurality of grids includes: dividing the target image into grids according to the size of the bounding boxes.
In an optional example, the plurality of bounding boxes being the same size means that the bounding boxes all have the same width and the same height.
In an optional example, determining the grids to which the plurality of bounding boxes belong includes: determining a target coordinate point, where the target coordinate point is any coordinate point on or inside the bounding box; and taking the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
In an optional example, the target coordinate point is: the upper-right, upper-left, lower-right, or lower-left corner coordinate point of the bounding box, or its center coordinate point.
In an optional example, after determining the grids to which the plurality of bounding boxes belong, the method further includes: establishing an index queue for each grid according to the grids to which the bounding boxes belong, where the index queue contains the ordering numbers of at least one bounding box.
Obtaining the neighboring bounding boxes of the reference bounding box then includes: obtaining the neighboring bounding boxes from the target grid and the grids adjacent to the target grid according to the index queues.
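Collecting the neighboring bounding boxes from the target grid and its adjacent grids via such index queues can be sketched as follows; the dict of lists standing in for the index queues, the eight-neighbor layout, and the function name are assumptions for illustration.

```python
def neighbors_of(cell, cells, exclude=None):
    """Collect box indices from a grid cell and its adjacent cells.
    `cells` maps (column, row) -> list of box indices (the index queues);
    `exclude` is the reference box itself, which is not its own neighbor."""
    col, row = cell
    out = []
    for dc in (-1, 0, 1):
        for dr in (-1, 0, 1):
            for i in cells.get((col + dc, row + dr), []):
                if i != exclude:
                    out.append(i)
    return out
```

Because only the cells around the target grid are consulted, the candidate set stays small regardless of how many boxes exist elsewhere on the image.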
In an optional example, before the bounding boxes are sorted by score, the method further includes: determining the plurality of bounding boxes to be those bounding boxes whose scores are greater than a preset score.
In the embodiment of the application, the bounding boxes with scores greater than the preset score are screened out first, and only those are sorted and traversed. Sorting and traversing low-score bounding boxes is thus avoided, which preserves the reliability of the detection result while improving detection efficiency.
In an optional example, determining the target detection result according to the score of the target bounding box includes: determining the target detection result according to the target bounding boxes whose scores are greater than a preset score.
In a second aspect, an embodiment of the present application provides a target detection apparatus, including: an obtaining unit, configured to obtain a plurality of bounding boxes on the target image and scores of the bounding boxes, where a score represents the confidence that a bounding box contains a target object; a dividing unit, configured to divide the target image into a plurality of grids and determine the grids to which the plurality of bounding boxes belong; a traversal unit, configured to traverse the plurality of bounding boxes, compute the overlap between a reference bounding box and its neighboring bounding boxes, and obtain a target bounding box according to the overlap, where the reference bounding box is any one of the bounding boxes, the neighboring bounding boxes include bounding boxes belonging to a target grid and bounding boxes belonging to grids adjacent to the target grid, the target grid is the grid to which the reference bounding box belongs, and the neighboring bounding boxes do not include the reference bounding box itself; and a determining unit, configured to determine a target detection result according to the score of the target bounding box.
In an optional example, the apparatus further includes a sorting unit, specifically configured to sort the bounding boxes by score to obtain an ordering number for each bounding box.
In an optional example, the sorting unit is specifically configured to sort the bounding boxes in descending order of score, so that a bounding box with a smaller score has a larger ordering number.
In an optional example, the traversal unit is specifically configured to: take the bounding box with ordering number i as the reference bounding box and obtain the flag bits of the bounding boxes; if the flag bit of the reference bounding box is the first flag value, obtain the neighboring bounding boxes of the reference bounding box and check whether each neighbor's ordering number is greater than i; if a neighbor's ordering number is greater than i, compute the IOU of the reference bounding box and that neighbor; if the IOU is greater than a preset threshold, set the neighbor's flag bit to the second flag value; and finally take the bounding boxes whose flag bit is still the first flag value as the target bounding boxes.
In an optional example, the bounding boxes differ in size, and the dividing unit is specifically configured to divide the target image into grids according to the maximum of the sizes of the plurality of bounding boxes.
In an optional example, the bounding boxes are the same size, and the dividing unit is specifically configured to divide the target image into grids according to the size of the bounding boxes.
In an optional example, in determining the grids to which the plurality of bounding boxes belong, the dividing unit is specifically configured to: determine a target coordinate point, where the target coordinate point is any coordinate point on or inside the bounding box; and take the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
In an optional example, the target coordinate point is: the upper-right, upper-left, lower-right, or lower-left corner coordinate point of the bounding box, or its center coordinate point.
In an optional example, after determining the grids to which the plurality of bounding boxes belong, the dividing unit is further configured to establish an index queue for each grid according to the grids to which the bounding boxes belong, where the index queue contains the ordering numbers of at least one bounding box;
in obtaining the neighboring bounding boxes of the reference bounding box, the traversal unit is further configured to obtain the neighboring bounding boxes from the target grid and the grids adjacent to the target grid according to the index queues.
In an optional example, the sorting unit is further configured to: before the bounding boxes are sorted by score, determine the plurality of bounding boxes to be those bounding boxes whose scores are greater than a preset score.
In an optional example, the determining unit is specifically configured to determine the target detection result according to the target bounding boxes whose scores are greater than a preset score.
In a third aspect, an embodiment of the present application provides an apparatus, including:
A processor and a transmission interface;
The processor invokes the executable program code stored in the memory to cause the apparatus to perform any of the methods of the first aspect.
In an alternative example, the apparatus further comprises: the memory is coupled to the processor.
In an alternative example, the apparatus further comprises: and the image sensor is used for acquiring the target image.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium including program instructions which, when run on a computer, cause the computer to perform any of the methods described in the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising instructions which, when run on a computer or processor, cause the computer or processor to perform a method as in the first aspect or any of its possible implementations.
These and other aspects of the application will be more readily apparent from the following description of the embodiments.
Drawings
To illustrate the embodiments of the application or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the application; a person skilled in the art could obtain other drawings from them without inventive effort.
FIG. 1A is a schematic diagram of generating a plurality of bounding boxes according to an embodiment of the present application;
FIG. 1B is a schematic diagram of another method for generating a plurality of bounding boxes according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a target detection method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a meshing scheme according to an embodiment of the present application;
FIG. 4A is a schematic diagram of a meshing process according to an embodiment of the present application;
FIG. 4B is a schematic diagram of determining a grid to which a bounding box belongs according to an embodiment of the present application;
FIG. 4C is a schematic diagram of a grid adjacency provided by an embodiment of the present application;
FIG. 5 is a flowchart of a method for traversing multiple bounding boxes and obtaining a target bounding box according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an index queue in a grid according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a target detection apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an apparatus according to an embodiment of the present application.
Detailed Description
To enable those skilled in the art to better understand the present application, the following describes the embodiments of the present application with reference to the accompanying drawings.
The terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish between similar elements and do not necessarily describe a particular sequence or chronological order. Furthermore, the terms "comprise" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion: a process, method, system, article, or apparatus that comprises a series of steps or elements is not necessarily limited to those explicitly listed, but may include other steps or elements not explicitly listed or inherent to it.
It should be understood that in this application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes the association relationship of associated objects and indicates three possible relationships; for example, "A and/or B" can represent: only A, only B, or both A and B, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of the following" or similar expressions means any combination of the listed items, including any single item or any combination of several items. For example, at least one of a, b, or c can represent: a; b; c; "a and b"; "a and c"; "b and c"; or "a and b and c", where a, b, and c may each be singular or plural.
When a deep learning detection network is used for target detection, the confidence and position of the target object are determined from the scores and positions of the bbox boxes generated on the target image. To avoid missing the target object, the bbox boxes are generated densely, so multiple bounding boxes appear on the same target object. Referring to FIG. 1A, a schematic diagram of generating a plurality of bounding boxes according to an embodiment of the present application: as shown in (a) of FIG. 1A, 3 bbox boxes are generated on a face image, each with a different score. To select the optimal box among the overlapping boxes, the NMS algorithm is introduced to suppress the low-score boxes and keep the highest-scoring bbox on the target object, shown in (b) of FIG. 1A, as the optimal box. Finally, the position of the target object is determined from the optimal box, and the confidence that the target object is a face is determined to be 0.98.
Alternatively, referring to FIG. 1B, a schematic diagram of another example of generating a plurality of bounding boxes according to an embodiment of the present application: as shown in (c) of FIG. 1B, 3 bbox boxes are generated on each of 2 faces, and 2 optimal boxes are selected after suppressing the low-score bounding boxes with the NMS algorithm. Finally, the positions of the two target objects are determined from the 2 optimal boxes, with face confidences of 0.93 and 0.80, respectively.
In the conventional method, when the NMS algorithm is used to suppress low-score bounding boxes and obtain the optimal boxes, all bbox boxes are first sorted in descending order of score, and the optimal boxes are then screened out through repeated IOU calculations.
The specific flow of the repeated IOU calculations is as follows:
(1) Select the bbox box with the highest score on the target image and denote it S_bbox.
(2) Perform IOU calculations between S_bbox and all boxes ranked after it in turn, using the formula:
IOU = (A1 ∩ A2) / (A1 ∪ A2)
where A1 and A2 denote the areas of the two boxes participating in the calculation. If the IOU is greater than a set threshold T, the box is suppressed by S_bbox and is deleted from the sorted queue.
(3) If the selected S_bbox is not the last bbox box in the sorted queue, select the box immediately after it in the queue as the new S_bbox and repeat step (2).
As can be seen from this principle, each selected S_bbox must perform IOU calculations with all boxes ranked after it, and in the worst case the algorithm complexity is O(N²). When the number of bbox boxes increases, the time consumption therefore grows quadratically; the increased NMS time reduces the detection frame rate and seriously degrades the detection performance of the algorithm.
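The three-step flow above is the classic NMS procedure; a minimal sketch follows, where the box format (x1, y1, x2, y2) and the helper names are illustrative assumptions rather than anything specified by the patent.

```python
def iou(a, b):
    """IOU = (A1 ∩ A2) / (A1 ∪ A2) for boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, t=0.5):
    """Classic NMS: each selected S_bbox is compared with every box ranked
    after it, giving O(N^2) IOU calculations in the worst case."""
    queue = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while queue:
        s_bbox = queue.pop(0)    # step (1): highest-scoring remaining box
        keep.append(s_bbox)
        # step (2): delete boxes suppressed by S_bbox (IOU > threshold T)
        queue = [i for i in queue if iou(boxes[s_bbox], boxes[i]) <= t]
    return keep
```

The inner list comprehension scans every remaining box for each kept S_bbox, which is exactly the all-pairs cost the grid-based method of this application avoids.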
To solve the above problems, an embodiment of the present application provides a target detection method. Referring to FIG. 2, a schematic flowchart of the method, it includes the following steps:
201. Obtain a plurality of bounding boxes on the target image and the scores of the bounding boxes, where the scores represent the confidence that the bounding boxes contain the target object.
The target image is the image on which target detection is to be performed. It may be an image directly provided by the input device, or a partial image, determined according to a size or range set by the user, of an image provided by the input device. As in the conventional NMS flow, the plurality of bounding boxes generated on the target image and their scores are first obtained, where the score of a bounding box represents the confidence that it contains the target object and may be any value between 0 and 1.
202. Divide the target image into a plurality of grids, and determine the grids to which the plurality of bounding boxes belong.
The target image is divided into a plurality of grids, each grid corresponding to a part of the target image. The grids may be divided randomly, according to the shape and size of the target image, or according to the positions or sizes of the bounding boxes. Referring to FIG. 3, a schematic diagram of grid division provided in an embodiment of the present application: as shown in (a) of FIG. 3, the grids are divided according to the shape and size of the target image, so that the generated grids exactly cover the complete target image; or, as shown in (b) of FIG. 3, the grids are divided according to the positions of the bounding boxes in the target image, so that the generated grids completely cover the bounding boxes, while regions without bounding boxes are not divided. After the grid division is completed, the grid to which each bounding box belongs can be determined from the position where the bounding box was generated.
The grids may be divided with non-fixed sizes, i.e., the grids divided on the target image have different widths and heights; or with a fixed size, i.e., all grids divided on the target image have the same width and height.
When the grids are divided with a fixed size, the fixed grid size can be determined from the target image size and the number of grids, or the number of grids can be determined from the target image size and the fixed grid size. For the former case, for example, if the target image is 3 cm × 2 cm and the number of grids is 3 × 2, the fixed grid size is 1 cm × 1 cm; if the number of grids is 3 × 4, the fixed grid size is 1 cm × 0.5 cm. For the latter case, for example, if the target image is 8 cm × 6 cm and the grid size is 2 cm × 2 cm, then the number of horizontal grids is Nw = ceil(8 cm / 2 cm) = 4, the number of vertical grids is Nh = ceil(6 cm / 2 cm) = 3, and the total number of grids is 4 × 3 = 12. Here the ceil function rounds up, i.e., a grid is also generated for the remainder of the target image that cannot fill a full grid. In some cases, the number of grids can instead be computed with the floor function (rounding down) or the round function (rounding to nearest): because the probability of a bounding box being generated in a region smaller than one grid is low, or the score of such a bounding box is also low, the overlap calculation for bounding boxes in that part of the target image may be omitted.
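The ceil-based grid count in the worked example above can be written directly; the function name is an illustrative assumption.

```python
from math import ceil

def grid_counts(img_w, img_h, cell_w, cell_h):
    """Number of grids along each axis, rounding up so that the remainder
    of the image smaller than one cell still gets its own grid."""
    return ceil(img_w / cell_w), ceil(img_h / cell_h)

# The example from the text: an 8 cm x 6 cm image with 2 cm x 2 cm cells
# gives 4 horizontal grids, 3 vertical grids, 12 grids in total.
nw, nh = grid_counts(8, 6, 2, 2)
```

Swapping `ceil` for `floor` or `round` reproduces the alternative rounding strategies mentioned in the text, at the cost of leaving the edge strip of the image uncovered.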
Optionally, the sizes of the plurality of bounding boxes are different, and dividing the target image to obtain a plurality of grids includes: the target image is divided into a plurality of grids according to the maximum size of the sizes of the plurality of bounding boxes.
The different sizes of the bounding boxes means that the bounding boxes have different widths and heights. In the mesh division according to the fixed size, if the sizes of the plurality of bounding boxes are different, the largest size among the sizes of the plurality of bounding boxes may be taken as the fixed size of the mesh division. This may be such that all neighboring bounding boxes can be covered when the neighboring bounding box of the reference bounding box is acquired.
Optionally, the sizes of the plurality of bounding boxes are the same, and dividing the target image to obtain a plurality of grids includes: and dividing the target image into a plurality of grids according to the sizes of the plurality of bounding boxes.
The plurality of bounding boxes are the same size, meaning that the plurality of bounding boxes have the same width and height. In the mesh division according to the fixed size, if the sizes of the plurality of bounding boxes are the same, the sizes of the plurality of bounding boxes may be taken as the fixed size of the mesh division.
Specifically, referring to fig. 4A, which is a schematic diagram of a grid dividing process according to an embodiment of the present application: as shown in fig. 4A, the height H and width W of the target image are obtained, and the height Hb and width Wb of the bounding box are used as the fixed dimensions of the divided grids. Illustratively, the number of grids in the width direction is Nw=ceil(W/Wb), and the number of grids in the height direction is Nh=ceil(H/Hb), where ceil represents rounding up. Each cell on the right of fig. 4A is a divided grid.
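The division above can be sketched in a few lines of Python (a minimal illustration; the function name and sample values are assumptions, not from the embodiment):

```python
import math

def divide_into_grids(img_w, img_h, grid_w, grid_h):
    """Divide a target image of size img_w x img_h into grids of fixed
    size grid_w x grid_h (e.g. the bounding-box width Wb and height Hb)."""
    nw = math.ceil(img_w / grid_w)  # grids in the width direction, Nw
    nh = math.ceil(img_h / grid_h)  # grids in the height direction, Nh
    return nw, nh

# The 8cm x 6cm image with 2cm x 2cm grids from the text above:
print(divide_into_grids(8, 6, 2, 2))  # (4, 3), i.e. 12 grids in total
```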
Optionally, determining the grid to which the plurality of bounding boxes belong includes: determining a target coordinate point, wherein the target coordinate point is any coordinate point on or in the boundary box; and determining the grid to which the target coordinate point belongs as the grid to which the boundary box belongs.
Optionally, the target coordinate point includes: an upper right corner coordinate point of the bounding box, an upper left corner coordinate point of the bounding box, a lower right corner coordinate point of the bounding box, a lower left corner coordinate point of the bounding box, or a center coordinate point of the bounding box.
After the grid division of the target image is completed, the grid to which each bounding box belongs is determined according to the position of the bounding box. A bounding box has an area, and different parts of it may fall into different grids. The grid to which a bounding box belongs may be determined as the grid to which any one coordinate point on or inside the bounding box belongs. Alternatively, the grids to which a plurality of coordinate points on or inside the bounding box belong may be determined, and the grid containing the largest number of those coordinate points taken as the grid to which the bounding box belongs.
Taking as an example the case where the grid to which a bounding box belongs is determined by the grid to which a single coordinate point on or in the bounding box belongs, refer to fig. 4B, which is a schematic diagram of determining the grid to which a bounding box belongs according to an embodiment of the present application. As shown in fig. 4B, the generated grids are numbered first, and then the grid number to which each bounding box belongs is determined. Taking the bounding box with a score of 0.84 as an example, and assuming the coordinate point of its upper left corner is (Xmin, Ymax), the grid number to which the bounding box belongs may be determined with the following formulas:
Iw=floor(Xmin/Wb)
Ih=floor(Ymax/Hb)
Iw represents the grid index of the bounding box in the width direction, and Ih represents the grid index in the height direction; that is, the grid number R_Ih_Iw of the bounding box is determined according to the grid to which its upper-left corner coordinate point belongs.
It can be seen that the grids in fig. 4B are numbered from 0, so the above formulas for determining the grid to which the bounding box belongs use the floor function to round down. If the grids were numbered from 1, the formulas could use the ceil function to round up instead.
Similarly, the upper right corner coordinate point (Xmax, Ymax), the lower right corner coordinate point (Xmax, Ymin), or the center coordinate point (Xmed, Ymed) of the bounding box may be obtained, and the grids to which these coordinate points belong calculated with the above formulas, thereby determining the grid to which the bounding box belongs.
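A sketch of the grid-index computation, assuming image coordinates and grids numbered from 0 (the helper name is hypothetical):

```python
import math

def grid_of_point(x, y, grid_w, grid_h):
    """Return (Ih, Iw), the 0-based grid indices of a coordinate point,
    following Iw = floor(x / Wb), Ih = floor(y / Hb) from the text."""
    return math.floor(y / grid_h), math.floor(x / grid_w)

# A point at (2.5, 1.2) in 1x1 grids falls into grid R_1_2:
print(grid_of_point(2.5, 1.2, 1, 1))  # (1, 2)
```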
203. Traversing the plurality of bounding boxes, calculating the overlap between the reference bounding box and its adjacent bounding boxes, and obtaining the target bounding boxes according to the overlap; the reference bounding box is any one of the plurality of bounding boxes; the adjacent bounding boxes comprise bounding boxes belonging to a target grid and bounding boxes belonging to the adjacent grids of the target grid, where the target grid is the grid to which the reference bounding box belongs; and the adjacent bounding boxes do not comprise the reference bounding box.
After determining the grids to which each bounding box belongs, the relationship between bounding boxes can be determined from the grids to which the bounding boxes belong and the relationship between grids. In the embodiment of the present application, the relationship between bounding boxes mainly refers to a neighboring or non-neighboring relationship.
Alternatively, the adjacent grids of the target grid may be determined as the eight-neighborhood grids of the target grid. The eight-neighborhood grids of a grid are all grids that share an edge or a vertex with it. Referring to fig. 4C, a schematic diagram of grid adjacency provided in an embodiment of the present application: as shown in (a) of fig. 4C, grid R_1_1 is the target grid, and the filled grids are its eight-neighborhood grids, that is, the adjacent grids of the target grid.
Alternatively, the adjacent grids of the target grid may be determined as the four-neighborhood grids of the target grid. The four-neighborhood grids of a grid are all grids that share an edge with it. As shown in (b) of fig. 4C, the filled grids are the four-neighborhood grids of the target grid R_1_1, that is, the adjacent grids of the target grid.
Alternatively, the adjacent grids of the target grid may be determined by their center distance from the target grid. For example, a grid whose center is at a distance not greater than a preset distance from the center of the target grid may be taken as an adjacent grid; the preset distance may be the side length of the target grid, a multiple of the side length, or another value. In (c) of fig. 4C, assuming the preset distance is the side length of the target grid R_1_1, the distances between the centers of grids R_0_1, R_1_0, R_1_2 and R_2_1 and the center of the target grid are equal to the preset distance, satisfying the condition that the center-to-center distance is not greater than the preset distance, so these four grids are the adjacent grids of the target grid.
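The neighborhood choices above can be sketched as follows (Python; the helper is hypothetical, and note that for equal square grids a center-distance threshold of one side length selects the same grids as the four-neighborhood, as in (c) of fig. 4C):

```python
def neighbor_grids(ih, iw, nh, nw, mode="eight"):
    """Return the adjacent grids of grid (ih, iw) inside an nh x nw layout.

    mode="eight": grids sharing an edge or vertex (eight-neighborhood);
    mode="four":  grids sharing an edge only (four-neighborhood), which is
                  also what a center-distance threshold of one side length
                  selects when all grids are equal squares.
    """
    if mode == "eight":
        offsets = [(dh, dw) for dh in (-1, 0, 1) for dw in (-1, 0, 1)
                   if (dh, dw) != (0, 0)]
    else:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    # keep only neighbors that actually lie inside the grid layout
    return [(ih + dh, iw + dw) for dh, dw in offsets
            if 0 <= ih + dh < nh and 0 <= iw + dw < nw]

# R_1_1 in a 3x3 layout has 8 eight-neighborhood and 4 four-neighborhood
# grids; a corner grid such as R_0_0 has only 3 eight-neighborhood grids.
```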
After determining the neighboring mesh of the target mesh, the neighboring bounding box of the reference bounding box may be determined. The bounding boxes of the same grid as the reference bounding box and the bounding boxes of adjacent grids of the reference bounding box are all adjacent bounding boxes of the reference bounding box. Wherein the neighboring bounding box of the reference bounding box does not include itself.
Optionally, before traversing the plurality of bounding boxes, the method further comprises: and sequencing the bounding boxes according to the score of the bounding boxes to obtain sequencing numbers corresponding to the bounding boxes.
After the plurality of bounding boxes are generated on the target image, each bounding box has a corresponding score, which may be any value between 0 and 1 and represents the confidence that the bounding box contains the target object to be detected. Because high-score bounding boxes suppress low-score bounding boxes, the bounding boxes are sorted by score; once the ranking numbers corresponding to the bounding boxes are obtained, the bounding boxes are traversed in ranking order, and whether a reference bounding box needs an overlap calculation with an adjacent bounding box can be decided from the ranking numbers of the two boxes, which effectively improves traversal efficiency.
The bounding boxes may be arranged in descending order of score, i.e., the higher the score, the smaller the ranking number of the corresponding bounding box; or in ascending order of score, i.e., the higher the score, the larger the ranking number. Taking descending order as an example, the ranking numbers corresponding to the plurality of bounding boxes in fig. 1B are shown in table 1:
TABLE 1
| Ranking number | Bounding box score |
| 1 | 0.93 |
| 2 | 0.84 |
| 3 | 0.80 |
| 4 | 0.62 |
| 5 | 0.51 |
| 6 | 0.31 |
Optionally, before sorting the bounding boxes by their corresponding scores, the method further includes: retaining, as the plurality of bounding boxes, only bounding boxes whose scores are greater than a preset score.
In the above procedure, the bounding boxes and their corresponding scores are generated on the target image. These scores are arbitrary values between 0 and 1, but when the score of a bounding box is smaller than the preset score, the confidence that the bounding box contains the target object to be detected is low enough that, even if the bounding box is never suppressed, it cannot be used to determine that it contains the target object. Therefore, bounding boxes with scores lower than the preset score can be screened out directly. For example, if the preset score is 0.8, the bounding boxes with scores 0.62, 0.51 and 0.31 in fig. 1B are filtered out, leaving only the bounding boxes with scores of 0.93, 0.84, and 0.80. The corresponding ranking numbers are shown in table 2:
TABLE 2
| Ranking number | Bounding box score |
| 1 | 0.93 |
| 2 | 0.84 |
| 3 | 0.80 |
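The filtering and ranking steps can be sketched together (Python; the threshold comparison is written as >= so that, as in table 2, a box scoring exactly 0.80 survives a preset score of 0.8):

```python
def filter_and_rank(scores, preset_score=0.8):
    """Drop scores below the preset score, then rank the remainder in
    descending order; returns (ranking number, score) pairs."""
    kept = sorted((s for s in scores if s >= preset_score), reverse=True)
    return [(rank, s) for rank, s in enumerate(kept, start=1)]

# The six boxes of fig. 1B reduce to the three rows of table 2:
print(filter_and_rank([0.93, 0.84, 0.80, 0.62, 0.51, 0.31]))
# [(1, 0.93), (2, 0.84), (3, 0.8)]
```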
The process can effectively reduce the data volume for the subsequent overlap calculation, and improves the target detection efficiency. After all or part of the above steps are completed, a plurality of bounding boxes can be traversed to obtain a target bounding box, please refer to fig. 5, fig. 5 is a flowchart of a method for traversing a plurality of bounding boxes to obtain a target bounding box, which specifically includes the following steps:
501. The plurality of bounding boxes are N bounding boxes, and the identification bits of the N bounding boxes are initialized to 0;
502. Setting i to 0;
503. Acquiring the bounding box with ranking number i as the reference bounding box, and acquiring the identification bit of the reference bounding box; in case the identification bit of the reference bounding box is 1, step 504 is performed; in case the identification bit of the reference bounding box is 0, step 505 is performed;
504. Increment i by 1 and determine if i is less than N; when it is determined that i is less than N, step 503 is performed; upon determining that i is not less than N, step 511 is performed;
505. acquiring adjacent bounding boxes of the reference bounding box;
506. judging whether the sorting number of the adjacent bounding boxes is larger than i, if so, executing step 507, and if not, executing step 509;
507. calculating the intersection ratio of the reference bounding box and the adjacent bounding box, determining whether the intersection ratio is greater than a preset threshold, if not, executing step 509; if yes, go to step 508;
508. Setting the identification bit of the adjacent bounding box to 1, then executing step 509;
509. Determining whether the adjacent bounding box is the last adjacent bounding box of the reference bounding box, if so, executing step 504, otherwise, executing step 510;
510. Selecting a next adjacent bounding box, executing step 506;
511. and finishing the traversal, and acquiring the boundary box with the identification bit of 0 as a target boundary box.
In this process, N is the total number of bounding boxes obtained from the target image. The identification bit of a bounding box characterizes whether the bounding box is suppressed: a first identification value indicates that it is not suppressed, and a second identification value indicates that it is suppressed. In the initial state, the identification bits of all bounding boxes are set to the first identification value, which in the example of fig. 5 is 0, indicating that no bounding box is suppressed. It should be appreciated that the initial identification bit of a bounding box may also be another value or character. Further, the reference bounding boxes are acquired in order of their ranking numbers: in the example of fig. 5, the bounding boxes are ranked from 0, so i is first set to 0 and the bounding box with ranking number 0 is acquired as the first reference bounding box. If the bounding boxes were ranked from 1, i would first be set to 1 and the bounding box with ranking number 1 acquired as the first reference bounding box; that is, the initial value of i is determined by the first ranking number of the bounding boxes. Then the identification bit of the reference bounding box is acquired; since the identification bit of the first reference bounding box is 0, its adjacent bounding boxes are acquired and the overlap between the reference bounding box and each adjacent bounding box is calculated. The overlap represents the proportion of the area in which the reference bounding box and the adjacent bounding box coincide, and may be the ratio of the overlapping area to the area of the reference bounding box, to the area of the adjacent bounding box, or to the combined area of the reference bounding box and the adjacent bounding box.
The overlap may be expressed as an intersection over union, i.e., calculated with the IOU formula. If the IOU value is greater than the preset threshold, the adjacent bounding box is suppressed by the reference bounding box, and its identification bit is set to the second identification value, which is 1 in the example of fig. 5. The next adjacent bounding box of the reference bounding box is then acquired; provided the ranking number of the adjacent bounding box is larger than that of the reference bounding box and its identification bit is still the first identification value, the IOU of the reference bounding box and the adjacent bounding box is calculated, until every adjacent bounding box of the reference bounding box has undergone IOU calculation and the corresponding identification-bit modification. The next bounding box is then selected as the reference bounding box according to its ranking number, and the overlap calculation with its adjacent bounding boxes continues, until the bounding box with the Nth ranking number has been traversed. In fact, since the bounding box with the Nth ranking number has no adjacent bounding box with a larger ranking number, its traversal can be omitted, and the traversal can end after the (N-1)th bounding box is traversed.
After the plurality of bounding boxes are traversed, the identification bits of the suppressed bounding boxes have been modified to 1, while the identification bits of the unsuppressed bounding boxes remain 0. The bounding boxes whose identification bit is 0 are taken as the target bounding boxes; that is, the target bounding boxes are the unsuppressed ones among the plurality of bounding boxes.
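The traversal of fig. 5 can be sketched as follows (Python; for a self-contained illustration every later-ranked box is treated as adjacent, whereas the embodiment restricts adjacency to the target grid and its neighboring grids, which only shrinks the inner loop):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def traverse(boxes, iou_thresh=0.5):
    """Boxes are assumed sorted by descending score; returns the ranking
    numbers (0-based) of the unsuppressed, i.e. target, bounding boxes."""
    n = len(boxes)
    flag = [0] * n                    # step 501: identification bits = 0
    for i in range(n):                # steps 502-504: next reference box
        if flag[i] == 1:              # a suppressed box suppresses nothing
            continue
        for j in range(i + 1, n):     # step 506: ranking number > i only
            if flag[j] == 0 and iou(boxes[i], boxes[j]) > iou_thresh:
                flag[j] = 1           # steps 507-508: neighbor suppressed
    return [i for i in range(n) if flag[i] == 0]   # step 511

# Two heavily overlapping boxes plus one far away:
print(traverse([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]))  # [0, 2]
```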
Optionally, after determining the grids to which the plurality of bounding boxes belong, the method further includes: establishing an index queue corresponding to the grid according to the grid to which the boundary box belongs, wherein the index queue comprises at least one sequencing number of the boundary box; acquiring the neighboring bounding box of the reference bounding box includes: and acquiring adjacent boundary frames from the target grid and adjacent grids of the target grid according to the index queue.
In the above process, after determining the grid of the bounding boxes, before traversing the bounding boxes, an index queue may be established for the bounding boxes belonging to the grid, and the order of the index queues may be determined according to the ranking numbers of the bounding boxes. Referring to fig. 6, fig. 6 is a schematic diagram of an index queue in a grid according to an embodiment of the present application, as shown in fig. 6, a grid to which each bounding box belongs is sequentially determined according to a sorting number of the bounding box, and after determining that the bounding box belongs to a target grid, the sorting number of the bounding box is added to the index queue corresponding to the grid. In fig. 6, index queues of grids r_1_0, r_1_2 and r_2_2 are generated, the index queues include sorting numbers of one or more bounding boxes, the sorting numbers are sequentially arranged, and the index queues may further include scores of bounding boxes corresponding to each sorting number. After generating index queues of the boundary frames contained in the grids, when traversing the plurality of boundary frames to obtain adjacent boundary frames corresponding to the reference boundary frame, the adjacent boundary frames can be obtained according to the target grids to which the reference boundary frame belongs and the index queues corresponding to the adjacent grids of the target grids.
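A sketch of the index-queue construction (Python; grid membership here is decided by the upper-left corner as in fig. 4B, and the function name is an assumption):

```python
import math
from collections import defaultdict

def build_index_queues(ranked_boxes, grid_w, grid_h):
    """Map each grid (Ih, Iw) to the queue of ranking numbers of the
    bounding boxes belonging to it; since boxes are visited in ranking
    order, each queue comes out sorted by ranking number."""
    queues = defaultdict(list)
    for rank, (x1, y1, x2, y2) in enumerate(ranked_boxes):
        key = (math.floor(y1 / grid_h), math.floor(x1 / grid_w))
        queues[key].append(rank)
    return queues

# Two boxes landing in grids R_0_0 and R_0_1 of a layout with 2x2 grids:
q = build_index_queues([(0.5, 0.5, 1.5, 1.5), (2.5, 0.5, 3.5, 1.5)], 2, 2)
print(dict(q))  # {(0, 0): [0], (0, 1): [1]}
```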
205. And determining a target detection result according to the score of the target boundary box.
The target bounding box obtained according to the above steps is not suppressed by other bounding boxes and can be used to determine a target detection result. As shown in (d) of fig. 1B, the bounding box corresponding to 0.93 and 0.80 is the target bounding box, and the target detection result is determined according to the score of the target bounding box, that is, the score of the target bounding box indicates the confidence that the target object is included in the bounding box, and the greater the confidence, the greater the probability that the target object is included is indicated. For example, in the embodiment of the present application, the target object corresponding to the target detection is a face image, the score of the target bounding box 1 is 0.80, the confidence coefficient indicating that the face image is included in the range of the target bounding box 1 is 0.80, and the obtained target detection result may be: the probability that the target bounding box 1 includes a face image is 80%. If it is preset that when the bounding box score is greater than 0.7, it may be determined that the bounding box includes the target object, then the target detection result may be: the target bounding box 1 includes a face image therein.
Optionally, determining the target detection result according to the score of the target bounding box includes: and determining a target detection result according to a target boundary box with the score larger than a preset score.
If a plurality of target bounding boxes are obtained after traversing the bounding boxes and some of them have scores smaller than the preset score, then using those boxes to determine the target detection result would report probabilities of containing the target object that are lower than required. Such target bounding boxes can be screened out directly, and the target detection result determined only from the target bounding boxes whose scores are greater than the preset score, which improves the efficiency of generating the target detection result. Alternatively, the target detection result may also be determined from a target bounding box whose score is equal to the preset score.
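The final screening can be sketched in one step (Python; the threshold 0.7 follows the worked example above, and the helper name is hypothetical):

```python
def detection_results(target_boxes, scores, preset_score=0.7):
    """Report each target bounding box whose score exceeds the preset
    score as a detection, paired with its confidence."""
    return [(box, s) for box, s in zip(target_boxes, scores)
            if s > preset_score]

# The target boxes of fig. 1B(d), scored 0.93 and 0.80, both pass 0.7:
print(detection_results(["box_0", "box_1"], [0.93, 0.80]))
# [('box_0', 0.93), ('box_1', 0.8)]
```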
In the embodiment of the application, the target image is divided to obtain a plurality of grids; determining adjacent boundary frames corresponding to the boundary frames according to the grid division result; and traversing the multiple bounding boxes, calculating to obtain a target bounding box which is not inhibited by the adjacent bounding box in the multiple bounding boxes, and finally determining a target detection result according to the score of the target bounding box. In the process, grids of the boundary frames are determined, and then adjacent boundary frames of the boundary frames are determined, so that only the overlapping degree of the boundary frames and the adjacent boundary frames is required to be calculated when the boundary frames are traversed, the data processing amount is greatly reduced, the data processing efficiency is improved, and the target detection efficiency is further improved.
Referring to fig. 7, in an object detection device according to an embodiment of the present application, as shown in fig. 7, the device 700 includes:
An obtaining unit 701, configured to obtain a plurality of bounding boxes on a target image and scores of the plurality of bounding boxes, where the scores are used to characterize a confidence level of a target object contained in the bounding boxes;
a dividing unit 702, configured to divide the target image to obtain a plurality of grids, and determine grids to which the plurality of bounding boxes belong;
a traversing unit 703, configured to traverse the multiple bounding boxes, calculate an overlapping degree of the reference bounding box and the adjacent bounding box, and obtain a target bounding box according to the overlapping degree; the reference bounding box is any one of the plurality of bounding boxes, the adjacent bounding boxes comprise bounding boxes belonging to a target grid and bounding boxes belonging to adjacent grids of the target grid, the target grid is a grid to which the reference bounding box belongs, and the adjacent bounding boxes do not comprise the reference bounding box;
and a determining unit 704, configured to determine a target detection result according to the score of the target bounding box.
According to the device provided by the embodiment of the application, the target image is divided to obtain a plurality of grids; determining adjacent boundary frames corresponding to the boundary frames according to the grid division result; and traversing the multiple bounding boxes, calculating to obtain a target bounding box which is not inhibited by the adjacent bounding box in the multiple bounding boxes, and finally determining a target detection result according to the score of the target bounding box. In the process, grids of the boundary frames are determined, and then adjacent boundary frames of the boundary frames are determined, so that only the overlapping degree of the boundary frames and the adjacent boundary frames is required to be calculated when the boundary frames are traversed, the data processing amount is greatly reduced, the data processing efficiency is improved, and the target detection efficiency is further improved.
In an alternative example, the apparatus further comprises a sorting unit 705, in particular for:
And sequencing the bounding boxes according to the score of the bounding box to obtain sequencing numbers corresponding to the bounding boxes.
In an alternative example, the sorting unit 705 is specifically configured to:
And sorting the bounding boxes in descending order according to the score size, wherein the sorting numbers corresponding to the bounding boxes with smaller scores are larger.
In an alternative example, the traversing unit 703 is specifically configured to:
Acquiring a boundary box with an ordering number i in the plurality of boundary boxes as a reference boundary box, and acquiring an identification bit of the boundary box;
acquiring adjacent bounding boxes of the reference bounding box and judging whether the sorting number of the adjacent bounding box is larger than i under the condition that the identification bit of the reference bounding box is a first identification value;
calculating the intersection ratio of the reference bounding box and the adjacent bounding box under the condition that the sorting number of the adjacent bounding box is determined to be larger than i;
setting the identification bit of the adjacent bounding box to a second identification value under the condition that the intersection ratio is greater than a preset threshold;
And acquiring a boundary box with the identification bit as the first identification value as a target boundary box.
In an alternative example, the sizes of the bounding boxes are different, and the dividing unit 702 is specifically configured to:
and dividing the target image according to the maximum size in the sizes of the plurality of bounding boxes to obtain a plurality of grids.
In an alternative example, the plurality of bounding boxes have the same size, and the dividing unit 702 is specifically configured to:
And dividing the target image according to the sizes of the plurality of bounding boxes to obtain a plurality of grids.
In an optional example, in determining the grids to which the bounding boxes belong, the dividing unit 702 is specifically configured to:
Determining a target coordinate point, wherein the target coordinate point is any coordinate point on the boundary box or in the boundary box;
and determining the grid to which the target coordinate point belongs as the grid to which the boundary box belongs.
In an alternative example, the target coordinate point includes:
The upper right corner coordinate point of the boundary box, the upper left corner coordinate point of the boundary box, the lower right corner coordinate point of the boundary box, the lower left corner coordinate point of the boundary box or the center coordinate point of the boundary box.
In an optional example, after determining the grids to which the bounding boxes belong, the dividing unit 702 is further configured to:
Establishing an index queue corresponding to the grid according to the grid to which the boundary box belongs, wherein the index queue comprises at least one sequencing number of the boundary box;
in terms of the acquiring the neighboring bounding box of the reference bounding box, the traversing unit 703 is further configured to:
and acquiring adjacent boundary frames from the target grid and adjacent grids of the target grid according to the index queue.
In an alternative example, the sorting unit 705 is further configured to:
before the bounding boxes are ordered according to the corresponding score sizes, the bounding boxes are determined to be a plurality of bounding boxes with scores greater than a preset score.
In an alternative example, the determining unit 704 is specifically configured to:
And determining a target detection result according to a target boundary box with the score larger than a preset score.
In another embodiment of the application, referring to fig. 8, an apparatus 800 comprises at least one processor 801, at least one memory 802, and at least one communication interface 803, and further comprises an image sensor 804 and a display 805. The processor 801, the memory 802, the communication interface 803, the image sensor 804, and the display 805 are connected via a communication bus and communicate with one another.
The device 800 can be used in intelligent equipment such as intelligent access control and intelligent security protection. The device 800 may collect images through the image sensor 804, or connect with other communication devices or readable memory through the communication interface 803 to obtain image data, transmit the image data to the processor 801 for target detection, and the detected result will generally be subjected to post-processing (such as result classification, scoring, screening, identification, etc.), and then output the final result to the display 805 for display or storage in the memory 802.
The processor 801 may be a general purpose central processing unit (central processing unit, CPU), a graphics processor (graphics processing unit, GPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the above programs.
Communication interface 803 for optical fiber communication with other devices or communication networks.
The memory 802 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be standalone and coupled to the processor via a bus, or may be integrated with the processor.
The memory 802 is used for storing application program codes and program execution results for executing the above schemes, and is controlled to be executed by the processor 801. The processor 801 is configured to execute application code stored in the memory 802.
The code stored in the memory 802 may perform the object detection methods provided above, such as:
Obtaining a plurality of boundary boxes on a target image and scores of the boundary boxes, wherein the scores are used for representing the confidence level of the target object contained in the boundary boxes;
dividing the target image to obtain a plurality of grids, and determining grids to which the plurality of bounding boxes belong;
Traversing the plurality of bounding boxes, calculating the overlap between the reference bounding box and its adjacent bounding boxes, and obtaining the target bounding boxes according to the overlap; the reference bounding box is any one of the plurality of bounding boxes, the adjacent bounding boxes comprise bounding boxes belonging to a target grid and bounding boxes belonging to the adjacent grids of the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not comprise the reference bounding box;
And determining a target detection result according to the score of the target boundary box.
The apparatus 800 in the embodiment of the present application may be implemented by a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or the like, which is not limited in the embodiment of the present application.
Embodiments of the present application also provide a computer-readable storage medium having instructions stored therein, which when run on a computer or processor, cause the computer or processor to perform one or more steps of any of the methods described above. The respective constituent modules of the above-described signal processing apparatus may be stored in the computer-readable storage medium if implemented in the form of software functional units and sold or used as independent products.
Based on such understanding, embodiments of the present application also provide a computer program product containing instructions that, when run on a computer or processor, cause the computer or processor to perform any of the methods provided by the embodiments of the present application. The technical solution of the present application, in essence, or the portions contributing over the prior art, or all or part of the solution, may be embodied in the form of a software product stored on a storage medium, which includes instructions for causing a computer device or a processor therein to perform all or part of the steps of the methods described in the various embodiments of the application.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts; however, those skilled in the art should understand that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules involved are not necessarily required for the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of the units is merely a logical function division, and in actual implementation there may be other manners of division; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the coupling, direct coupling, or communication connection shown or discussed between components may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be in electrical or other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application, in essence, or the portions contributing over the prior art, or all or part of the solution, may be embodied in the form of a software product stored in a memory, which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present application. Those skilled in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware, and the program may be stored in a computer-readable memory, such as the foregoing memory 802, which is not described again herein.
The foregoing is a detailed description of the embodiments of the application, in which the principles and implementations of the application are explained using specific examples; these examples are provided solely to facilitate understanding of the method and core concepts of the application. Meanwhile, those skilled in the art may make modifications to the specific embodiments and the application scope in accordance with the idea of the present application, and the present disclosure should not be construed as limiting the present application.
Claims (23)
1. A method of target detection, the method comprising:
obtaining a plurality of bounding boxes on a target image and a score for each bounding box, wherein the score represents the confidence that the bounding box contains a target object;
dividing the target image into a plurality of grids, and determining the grid to which each of the plurality of bounding boxes belongs;
traversing the plurality of bounding boxes, calculating the degree of overlap between a reference bounding box and its adjacent bounding boxes, and obtaining a target bounding box according to the degree of overlap, wherein the reference bounding box is any one of the plurality of bounding boxes, the adjacent bounding boxes comprise bounding boxes belonging to a target grid and bounding boxes belonging to grids adjacent to the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not comprise the reference bounding box;
and determining a target detection result according to the score of the target bounding box.
2. The method of claim 1, wherein prior to traversing the plurality of bounding boxes, the method further comprises:
sorting the plurality of bounding boxes according to their scores to obtain a sorting number corresponding to each bounding box.
3. The method according to claim 2, wherein the sorting of the bounding boxes according to their scores specifically comprises:
sorting the bounding boxes in descending order of score, wherein a bounding box with a smaller score has a larger sorting number.
4. The method according to claim 3, wherein traversing the plurality of bounding boxes, calculating the degree of overlap between a reference bounding box and its adjacent bounding boxes, and obtaining a target bounding box according to the degree of overlap comprises:
acquiring a bounding box with sorting number i among the plurality of bounding boxes as the reference bounding box, and acquiring an identification bit of the bounding box;
in a case where the identification bit of the reference bounding box is a first identification value, acquiring the adjacent bounding boxes of the reference bounding box and judging whether the sorting number of each adjacent bounding box is larger than i;
calculating the intersection-over-union ratio of the reference bounding box and the adjacent bounding box in a case where the sorting number of the adjacent bounding box is determined to be larger than i;
setting the identification bit of the adjacent bounding box to a second identification value in a case where the intersection-over-union ratio is larger than a preset threshold value;
and acquiring a bounding box whose identification bit is the first identification value as a target bounding box.
5. The method according to any one of claims 1-4, wherein the plurality of bounding boxes differ in size, and dividing the target image into a plurality of grids comprises:
dividing the target image according to the maximum size among the sizes of the plurality of bounding boxes to obtain a plurality of grids.
6. The method according to any one of claims 1-4, wherein the plurality of bounding boxes are of the same size, and dividing the target image into a plurality of grids comprises:
dividing the target image according to the size of the plurality of bounding boxes to obtain a plurality of grids.
7. The method of claim 1, wherein the determining the grid to which the plurality of bounding boxes belong comprises:
determining a target coordinate point, wherein the target coordinate point is any coordinate point on or within the bounding box;
and determining the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
8. The method of claim 7, wherein the target coordinate point comprises:
the upper-right corner coordinate point, the upper-left corner coordinate point, the lower-right corner coordinate point, the lower-left corner coordinate point, or the center coordinate point of the bounding box.
9. The method of claim 2, wherein after determining the grid to which the plurality of bounding boxes belong, the method further comprises:
establishing, according to the grid to which each bounding box belongs, an index queue corresponding to the grid, wherein the index queue comprises the sorting number of at least one bounding box;
and the acquiring of the adjacent bounding boxes of the reference bounding box comprises:
acquiring the adjacent bounding boxes from the target grid and the grids adjacent to the target grid according to the index queue.
10. The method according to claim 2 or 3, wherein before sorting the bounding boxes according to their scores, the method further comprises:
determining, as the plurality of bounding boxes, bounding boxes whose scores are greater than a preset score.
11. The method of claim 1, wherein determining a target detection result based on the score of the target bounding box comprises:
determining the target detection result according to a target bounding box whose score is greater than a preset score.
12. An object detection device, the device comprising:
an acquisition unit, configured to acquire a plurality of bounding boxes on a target image and a score for each bounding box, wherein the score represents the confidence that the bounding box contains a target object;
a dividing unit, configured to divide the target image into a plurality of grids and determine the grid to which each of the plurality of bounding boxes belongs;
a traversing unit, configured to traverse the plurality of bounding boxes, calculate the degree of overlap between a reference bounding box and its adjacent bounding boxes, and obtain a target bounding box according to the degree of overlap, wherein the reference bounding box is any one of the plurality of bounding boxes, the adjacent bounding boxes comprise bounding boxes belonging to a target grid and bounding boxes belonging to grids adjacent to the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not comprise the reference bounding box;
and a determining unit, configured to determine a target detection result according to the score of the target bounding box.
13. The apparatus according to claim 12, characterized in that the apparatus further comprises a sorting unit, in particular for:
sort the plurality of bounding boxes according to their scores to obtain a sorting number corresponding to each bounding box.
14. The apparatus according to claim 13, wherein the sorting unit is specifically configured to:
sort the bounding boxes in descending order of score, wherein a bounding box with a smaller score has a larger sorting number.
15. The apparatus of claim 14, wherein the traversing unit is specifically configured to:
acquire a bounding box with sorting number i among the plurality of bounding boxes as the reference bounding box, and acquire an identification bit of the bounding box;
in a case where the identification bit of the reference bounding box is a first identification value, acquire the adjacent bounding boxes of the reference bounding box and judge whether the sorting number of each adjacent bounding box is larger than i;
calculate the intersection-over-union ratio of the reference bounding box and the adjacent bounding box in a case where the sorting number of the adjacent bounding box is determined to be larger than i;
set the identification bit of the adjacent bounding box to a second identification value in a case where the intersection-over-union ratio is larger than a preset threshold value;
and acquire a bounding box whose identification bit is the first identification value as a target bounding box.
16. The apparatus according to any one of claims 12-15, wherein the plurality of bounding boxes differ in size, and the dividing unit is specifically configured to:
divide the target image according to the maximum size among the sizes of the plurality of bounding boxes to obtain a plurality of grids.
17. The apparatus according to any one of claims 12-15, wherein the plurality of bounding boxes are of the same size, and the dividing unit is specifically configured to:
divide the target image according to the size of the plurality of bounding boxes to obtain a plurality of grids.
18. The apparatus according to claim 12, wherein the dividing unit is specifically configured to:
determine a target coordinate point, wherein the target coordinate point is any coordinate point on or within the bounding box;
and determine the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
19. The apparatus of claim 18, wherein the target coordinate point comprises:
the upper-right corner coordinate point, the upper-left corner coordinate point, the lower-right corner coordinate point, the lower-left corner coordinate point, or the center coordinate point of the bounding box.
20. The apparatus of claim 13, wherein after determining the grid to which the plurality of bounding boxes belong, the partitioning unit is further to:
establish, according to the grid to which each bounding box belongs, an index queue corresponding to the grid, wherein the index queue comprises the sorting number of at least one bounding box;
and in acquiring the adjacent bounding boxes of the reference bounding box, the traversing unit is further configured to:
acquire the adjacent bounding boxes from the target grid and the grids adjacent to the target grid according to the index queue.
21. The apparatus according to claim 13 or 14, wherein the sorting unit is further configured to:
before the bounding boxes are sorted according to their scores, determine, as the plurality of bounding boxes, bounding boxes whose scores are greater than a preset score.
22. The apparatus according to claim 12, wherein the determining unit is specifically configured to:
determine the target detection result according to a target bounding box whose score is greater than a preset score.
23. An apparatus, comprising: a processor and a transmission interface;
the processor invokes executable program code stored in a memory to perform the method of any of claims 1-11.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911026844.2A CN112711972B (en) | 2019-10-26 | 2019-10-26 | Target detection method and device |
| PCT/CN2020/108964 WO2021077868A1 (en) | 2019-10-26 | 2020-08-13 | Target detection method and device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911026844.2A CN112711972B (en) | 2019-10-26 | 2019-10-26 | Target detection method and device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112711972A CN112711972A (en) | 2021-04-27 |
| CN112711972B true CN112711972B (en) | 2024-06-14 |
Family
ID=75541559
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201911026844.2A Active CN112711972B (en) | 2019-10-26 | 2019-10-26 | Target detection method and device |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN112711972B (en) |
| WO (1) | WO2021077868A1 (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11238314B2 (en) * | 2019-11-15 | 2022-02-01 | Salesforce.Com, Inc. | Image augmentation and object detection |
| CN113705643B (en) * | 2021-08-17 | 2022-10-28 | 荣耀终端有限公司 | Target detection method and device and electronic equipment |
| CN114782346B (en) * | 2022-04-13 | 2024-10-29 | 大连理工大学 | Cloth image defect detection method based on polymorphic data amplification and block identification |
| CN116167321B (en) * | 2023-01-19 | 2024-07-26 | 深圳华大九天科技有限公司 | Protection ring generation method, device and related products |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109145756A (en) * | 2018-07-24 | 2019-01-04 | 湖南万为智能机器人技术有限公司 | Object detection method based on machine vision and deep learning |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9582895B2 (en) * | 2015-05-22 | 2017-02-28 | International Business Machines Corporation | Real-time object analysis with occlusion handling |
| CN108022238B (en) * | 2017-08-09 | 2020-07-03 | 深圳科亚医疗科技有限公司 | Method, computer storage medium, and system for detecting object in 3D image |
| US10867393B2 (en) * | 2018-03-22 | 2020-12-15 | Texas Instruments Incorporated | Video object detection |
| CN109117794A (en) * | 2018-08-16 | 2019-01-01 | 广东工业大学 | A kind of moving target behavior tracking method, apparatus, equipment and readable storage medium storing program for executing |
| CN109883400B (en) * | 2018-12-27 | 2021-12-10 | 南京国图信息产业有限公司 | Automatic target detection and space positioning method for fixed station based on YOLO-SITCOL |
| CN110059548B (en) * | 2019-03-08 | 2022-12-06 | 北京旷视科技有限公司 | Target detection method and device |
| CN110084173B (en) * | 2019-04-23 | 2021-06-15 | 精伦电子股份有限公司 | Human head detection method and device |
- 2019-10-26: CN application CN201911026844.2A filed (granted as CN112711972B, status Active)
- 2020-08-13: WO application PCT/CN2020/108964 filed (published as WO2021077868A1, not active: ceased)
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109145756A (en) * | 2018-07-24 | 2019-01-04 | 湖南万为智能机器人技术有限公司 | Object detection method based on machine vision and deep learning |
Non-Patent Citations (1)
| Title |
|---|
| "基于 YOLO v2 模型的交通标识检测算法";王 超 等;《计算机应用》;第276-278页 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112711972A (en) | 2021-04-27 |
| WO2021077868A1 (en) | 2021-04-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112711972B (en) | Target detection method and device | |
| US11594070B2 (en) | Face detection training method and apparatus, and electronic device | |
| CN109541634B (en) | Path planning method and device and mobile device | |
| JP6871314B2 (en) | Object detection method, device and storage medium | |
| CN113219992A (en) | Path planning method and cleaning robot | |
| US9959670B2 (en) | Method for rendering terrain | |
| Cheng et al. | Cross-trees, edge and superpixel priors-based cost aggregation for stereo matching | |
| CN104899853A (en) | Image region dividing method and device | |
| US20210217234A1 (en) | Device and method for extracting terrain boundary | |
| CN110264405A (en) | Image processing method, device, server and storage medium based on interpolation algorithm | |
| CN106407920A (en) | Stripe noise elimination method of fingerprint image | |
| CN107240146A (en) | The method and apparatus that operational indicator is shown | |
| CN112819700A (en) | Denoising method and device for point cloud data and readable storage medium | |
| CN109389110A (en) | A kind of area determination method and device | |
| KR102068745B1 (en) | A method and apparatus for segmenting a grid map into a plurality of rooms | |
| CN113448667B (en) | Method and device for generating display relationship diagram | |
| CN115147442A (en) | Grid pattern vectorization method, mobile terminal, electronic device, and medium | |
| JP2021156879A (en) | Fracture surface analysis device, fracture surface analysis method, and machine learning data set generation method | |
| CN119048834A (en) | Corn quality identification method, device, computer equipment and storage medium | |
| CN118196316A (en) | Pavement construction method, device, equipment and storage medium | |
| CN113117334B (en) | Method and related device for determining visible area of target point | |
| CN115690364A (en) | AR model acquisition method, electronic device and readable storage medium | |
| CN115880454B (en) | Planar feature detection method, device and equipment for live-action three-dimensional model | |
| CN113628335A (en) | Point cloud map construction method and device and computer readable storage medium | |
| CN108510575B (en) | Method for determining shielding relation of large object on 2D game oblique 45-degree map |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| CB02 | Change of applicant information | ||
Address after: Room 101, No. 2 Hongqiaogang Road, Qingpu District, Shanghai, 201721 Applicant after: Haisi Technology Co.,Ltd. Address before: Room 101, 318 Shuixiu Road, Jinze town (xicen), Qingpu District, Shanghai, 201799 Applicant before: Shanghai Haisi Technology Co.,Ltd. |
|
| GR01 | Patent grant | ||