
CN111626208B - Method and device for detecting small objects - Google Patents


Info

Publication number
CN111626208B
CN111626208B (application CN202010461384.2A)
Authority
CN
China
Prior art keywords
detection model
training
small target
image
network
Prior art date
Legal status
Active
Application number
CN202010461384.2A
Other languages
Chinese (zh)
Other versions
CN111626208A (en)
Inventor
何刚
Current Assignee
Apollo Intelligent Connectivity Beijing Technology Co Ltd
Original Assignee
Apollo Intelligent Connectivity Beijing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Apollo Intelligent Connectivity Beijing Technology Co Ltd filed Critical Apollo Intelligent Connectivity Beijing Technology Co Ltd
Priority to CN202010461384.2A (CN111626208B)
Publication of CN111626208A
Priority to JP2021051677A (JP7262503B2)
Priority to KR1020210040639A (KR102523886B1)
Application granted
Publication of CN111626208B
Legal status: Active


Classifications

    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06F9/54 Interprogram communication
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T7/11 Region-based segmentation
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V20/63 Scene text, e.g. street names
    • G06V2201/07 Target detection
    • G06V2201/09 Recognition of logos

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure disclose methods and apparatus for detecting small targets. One embodiment of the method comprises the following steps: acquiring an original image including a small target; reducing the original image to a low-resolution image; identifying a candidate region containing the small target from the low-resolution image using a lightweight segmentation network; and taking the region of the original image corresponding to the candidate region as a region of interest, running a pre-trained detection model on the region of interest, and determining the position of the small target in the original image. This embodiment designs a two-stage detection method: the region of interest is first located by the lightweight segmentation network, and the detection model is then run only within that region, which greatly reduces the amount of computation.

Description

Method and device for detecting small objects
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method and apparatus for detecting small objects.
Background
Target detection is an important research direction in the field of autonomous driving. The targets it mainly detects fall into two categories: stationary objects and moving objects. Stationary objects include traffic lights, traffic signs, lanes, and obstacles; moving objects include vehicles, pedestrians, and non-motor vehicles. Traffic sign detection provides rich and necessary navigation information for a driverless vehicle during driving, and is fundamental work of great significance.
In applications such as AR navigation, detecting the traffic signs of the current road section in real time and giving corresponding prompts to the user is of great significance. In vehicle-mounted video, the sizes of traffic signs span a wide range and include a large number of small targets (below 20 pixels). Detecting these small targets not only tests the detection algorithm but also requires the image to retain a high resolution, which is a severe demand on the limited computing capability of the vehicle.
To ensure timely traffic sign recognition, most existing schemes train a YOLO model on input images and predict the class of the traffic sign from the resulting prediction values, thereby completing recognition. The training network of the YOLO model is a CNN comprising seven convolutional layers C1-C7 and two fully connected layers, so recognition can be completed very quickly. However, traffic signs usually occupy only a small part of the captured original image, and the feature map shrinks with every convolutional layer it passes through, so after multiple convolutional layers the conventional YOLO method easily loses the features of smaller objects, which hurts the success rate of traffic sign recognition.
Disclosure of Invention
Embodiments of the present disclosure propose methods and apparatus for detecting small targets.
In a first aspect, embodiments of the present disclosure provide a method for detecting a small target, comprising: acquiring an original image including a small target; reducing the original image to a low-resolution image; identifying a candidate region containing the small target from the low-resolution image using a lightweight segmentation network; and taking the region of the original image corresponding to the candidate region as a region of interest, running a pre-trained detection model on the region of interest, and determining the position of the small target in the original image.
In some embodiments, the detection model is trained by the following method: determining a network structure of an initial detection model and initializing its network parameters; acquiring a training sample set, wherein a training sample comprises a sample image and annotation information characterizing the position of a small target in the sample image; enhancing the training samples by at least one of the following: copying, multi-scale change, and editing; taking the sample images and the annotation information of the training samples in the enhanced training sample set as the input and the expected output of the initial detection model, respectively, and training the initial detection model using a machine learning method; and determining the trained initial detection model as the pre-trained detection model.
In some embodiments, the training samples are edited by: cutting a small target out of the sample image; and pasting the small target, after scaling and/or rotation, to a random other position in the sample image to obtain a new sample image.
In some embodiments, the method further comprises: when making training samples for the segmentation network, setting the pixels inside the rectangular boxes originally annotated for the detection task as positive samples and the pixels outside as negative samples; expanding the rectangular boxes of small targets whose length and width are below a predetermined number of pixels; and setting the pixels inside the expanded rectangular boxes as positive samples.
In some embodiments, the detection model is a deep neural network.
In some embodiments, an attention module is introduced after the feature fusion of each prediction layer to learn an appropriate weight for the features of the different channels.
In a second aspect, embodiments of the present disclosure provide an apparatus for detecting a small target, comprising: an acquisition unit configured to acquire an original image including a small target; a reduction unit configured to reduce the original image to a low-resolution image; a first detection unit configured to identify a candidate region containing the small target from the low-resolution image using a lightweight segmentation network; and a second detection unit configured to take the region of the original image corresponding to the candidate region as a region of interest, run a pre-trained detection model on the region of interest, and determine the position of the small target in the original image.
In some embodiments, the apparatus further comprises a training unit configured to: determine a network structure of an initial detection model and initialize its network parameters; acquire a training sample set, wherein a training sample comprises a sample image and annotation information characterizing the position of a small target in the sample image; enhance the training samples by at least one of the following: copying, multi-scale change, and editing; take the sample images and the annotation information of the training samples in the enhanced training sample set as the input and the expected output of the initial detection model, respectively, and train the initial detection model using a machine learning method; and determine the trained initial detection model as the pre-trained detection model.
In some embodiments, the training unit is further configured to: cut a small target out of the sample image; and paste the small target, after scaling and/or rotation, to a random other position in the sample image to obtain a new sample image.
In some embodiments, the first detection unit is further configured to: when making training samples for the segmentation network, set the pixels inside the rectangular boxes originally annotated for the detection task as positive samples and the pixels outside as negative samples; expand the rectangular boxes of small targets whose length and width are below a predetermined number of pixels; and set the pixels inside the expanded rectangular boxes as positive samples.
In some embodiments, the detection model is a deep neural network.
In some embodiments, an attention module is introduced after the feature fusion of each prediction layer to learn an appropriate weight for the features of the different channels.
In a third aspect, embodiments of the present disclosure provide an electronic device for detecting a small target, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method as in any of the first aspects.
In a fourth aspect, embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as in any of the first aspects.
In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of the first aspects.
The method and apparatus for detecting a small target address the problem from three aspects: the training method, the model structure, and two-stage detection. The training method and the model structure mainly improve the model's ability to detect small targets, while two-stage detection reduces the computation spent on irrelevant regions of the picture and thereby increases the running speed.
The invention can provide a real-time traffic sign detection algorithm for AR navigation projects, performs well on small-target detection, and can improve the navigation experience of users.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;
FIG. 2 is a flow chart of one embodiment of a method for detecting small objects according to the present disclosure;
FIG. 3 is a schematic illustration of one application scenario of a method for detecting small objects according to the present disclosure;
FIG. 4 is a flow chart of yet another embodiment of a method for detecting small objects according to the present disclosure;
FIG. 5 is a network block diagram of a detection model for a method of detecting small objects according to the present disclosure;
FIG. 6 is a schematic structural view of one embodiment of an apparatus for detecting small objects according to the present disclosure;
fig. 7 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the method for detecting small targets or the apparatus for detecting small targets of the present application may be applied.
As shown in fig. 1, a system architecture 100 may include a vehicle 101 and a traffic sign 102.
The vehicle 101 may be a general motor vehicle or an unmanned vehicle. The vehicle 101 may have a controller 1011, a network 1012, and sensors 1013 installed therein. Network 1012 is the medium used to provide a communication link between controller 1011 and sensor 1013. Network 1012 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A controller (also known as an onboard brain) 1011 is responsible for intelligent control of the vehicle 101. The controller 1011 may be a separately provided controller such as a programmable logic controller (Programmable Logic Controller, PLC), a single chip microcomputer, an industrial controller, or the like; the device can also be equipment consisting of other electronic devices with input/output ports and operation control functions; but also a computer device installed with a vehicle driving control type application. The controller is provided with a trained segmentation network and a detection model.
The sensor 1013 may be any of various types of sensors, such as a video camera, a gravity sensor, a wheel speed sensor, a temperature sensor, a humidity sensor, a laser radar, a millimeter-wave radar, and the like. In some cases, a GNSS (Global Navigation Satellite System) device, a SINS (Strap-down Inertial Navigation System), and so on may also be installed in the vehicle 101.
The vehicle 101 captures a traffic sign 102 during travel. Whether the image is taken from far away or close up, the traffic sign in the image is a small target.
The vehicle 101 delivers the captured original image including the traffic sign to the controller for recognition to determine the location of the traffic sign. OCR can also be performed to recognize the content of the traffic sign, which is then output in the form of voice or text.
It should be noted that, the method for detecting a small target provided in the embodiment of the present application is generally performed by the controller 1011, and accordingly, the device for detecting a small target is generally disposed in the controller 1011.
It should be understood that the numbers of controllers, networks, and sensors in fig. 1 are merely illustrative. There may be any number of controllers, networks, and sensors, as required by the implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a method for detecting small objects according to the present disclosure is shown. The method for detecting a small target comprises the following steps:
In step 201, an original image including a small target is acquired.
In this embodiment, the execution body of the method for detecting a small target (e.g., the controller shown in fig. 1) may capture a front-view image through the vehicle-mounted camera, and the acquired original image includes a small target. A small target refers to an image of a target object whose length and width are below a predetermined number of pixels (e.g., 20).
Step 202, reducing the original image into a low resolution image.
In this embodiment, the length and width of the original image may each be divided by 4 (or another factor) to obtain a low-resolution image. The aspect ratio remains unchanged during shrinking.
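As a concrete illustration, a minimal Python sketch of this reduction step follows (OpenCV and the factor of 4 match the example above; the function name is an assumption for illustration):

```python
import cv2

def reduce_image(original, factor=4):
    """Shrink the image by the same factor in both directions,
    preserving the aspect ratio (factor=4 per the example above)."""
    h, w = original.shape[:2]
    return cv2.resize(original, (w // factor, h // factor),
                      interpolation=cv2.INTER_AREA)
```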
In step 203, a lightweight segmentation network is used to identify candidate regions from the low resolution image that include small objects.
In this embodiment, since the first detection stage only needs to locate the approximate position where a target may exist and does not need an accurate bounding box, it is implemented with a lightweight segmentation network; points in the heatmap finally output by the first stage that exceed a certain threshold are regarded as points where a target is suspected to exist. A segmentation network similar to U-Net may be used, with ShuffleNet as the backbone network for light weight.
When making training samples for the segmentation network, the pixels inside the rectangular boxes originally annotated for the detection task are set as positive samples, and the pixels outside are set as negative samples. Because the image is scaled in the length and width directions, to ensure recall on small targets, the rectangular boxes of targets whose length and width are below a predetermined value (e.g., 20 pixels) are doubled in size when the training samples are made, and the pixels inside the expanded boxes are then set as positive samples.
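A minimal sketch of how such segmentation labels might be built from detection boxes is given below; the 20-pixel cutoff and the doubling of small boxes follow the text above, while the function, box format, and array layout are illustrative assumptions:

```python
import numpy as np

def make_segmentation_labels(image_shape, boxes, small_side=20):
    """Build a binary label mask from detection boxes: pixels inside a
    box are positive (1), pixels outside are negative (0). Boxes whose
    width and height are both below `small_side` are doubled in size
    around their center before being painted, protecting recall on
    small targets once the image is downscaled."""
    h, w = image_shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for x1, y1, x2, y2 in boxes:
        bw, bh = x2 - x1, y2 - y1
        if bw < small_side and bh < small_side:
            cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
            x1, x2 = cx - bw, cx + bw   # doubled width
            y1, y2 = cy - bh, cy + bh   # doubled height
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(w, int(x2)), min(h, int(y2))
        mask[y1:y2, x1:x2] = 1
    return mask
```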
Step 204, taking the region of the original image corresponding to the candidate region as the region of interest, running a pre-trained detection model on the region of interest, and determining the position of the small target in the original image.
In this embodiment, after noise points are filtered out of the segmentation network's output, the minimum enclosing rectangle surrounding all remaining suspected target points is formed, and the region corresponding to this rectangle in the unscaled high-resolution image is taken as the region of interest. The detection model is then run on the region of interest, so only part of the area of the high-resolution picture needs to be processed, which reduces the amount of computation.
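One way this stage-two region of interest could be derived from the stage-one heatmap is sketched below; the threshold value and the crude noise filter are assumptions, since the patent does not specify how noise points are filtered:

```python
import numpy as np

def heatmap_to_roi(heatmap, scale=4, threshold=0.5, min_points=3):
    """Threshold the segmentation heatmap, drop isolated noise, and
    return the minimum enclosing rectangle of all remaining suspected
    target points, mapped back to original-image coordinates
    (scale = the downscale factor used in stage one)."""
    ys, xs = np.nonzero(heatmap > threshold)
    if len(xs) < min_points:   # crude noise filter: too few points
        return None
    x1, x2 = xs.min(), xs.max() + 1
    y1, y2 = ys.min(), ys.max() + 1
    # map the rectangle back onto the full-resolution image
    return (x1 * scale, y1 * scale, x2 * scale, y2 * scale)
```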
As described above, detecting small objects well requires keeping the picture resolution high, but a large picture multiplies the amount of computation, making real-time processing difficult in a vehicle environment. On the other hand, traffic signs occupy a small proportion of the picture; most of it is background, the computation spent on background regions accounts for a large share of the total, and processing the background at high resolution is time-consuming and pointless. The invention therefore adopts a two-stage detection scheme: the approximate positions of suspected targets are located on the low-resolution picture by a lightweight segmentation network, the minimum enclosing rectangle containing all suspected targets is then obtained, and finally the detection model is run on the high-resolution image block corresponding to that rectangle, reducing the amount of computation while preserving the detection rate on small targets.
After this two-stage processing, the average computation of the detection model falls to about 25% of the original, and the average computation of the two models combined is about 45% of the original.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for detecting a small target according to the present embodiment. In the application scenario of fig. 3, the vehicle captures a front-view image in real time while driving. The length and width of the captured original image are each divided by 4, shrinking it into a low-resolution image. The low-resolution image is input into the lightweight segmentation network, which identifies a candidate region containing a traffic sign. The region of the original image corresponding to the candidate region is then found and taken as the region of interest. The image of the region of interest is cropped out and input into the pre-trained detection model, which determines the specific position of the traffic sign in the original image, as shown by the dotted-line box.
The method provided by this embodiment of the disclosure reduces the amount of computation and improves recognition speed and accuracy through two-stage detection.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for detecting small objects is shown. The process 400 of the method for detecting small objects comprises the steps of:
step 401, determining a network structure of an initial detection model and initializing network parameters of the initial detection model.
In this embodiment, the electronic device on which the method for detecting small objects runs (e.g., the controller shown in FIG. 1) may train the detection model. The detection model may also be trained by a third-party server and then installed into the controller of the vehicle. The detection model is a neural network model and may be any existing neural network for target detection.
In some alternative implementations of this embodiment, the detection model is a deep neural network, such as a YOLO-series network. YOLO (You Only Look Once) is an object recognition and localization algorithm based on a deep neural network; its greatest strength is its high running speed, which makes it suitable for real-time systems. YOLO has since evolved to version 3 (YOLOv3), each new version improving on the previous one. In the original structural design of YOLOv3, the low-resolution feature map is upsampled and fused with the high-resolution feature map. However, this fusion occurs only on the high-resolution feature map, so features of different scales are not fused sufficiently.
To better fuse the features of different layers, the invention first selects the features downsampled 8, 16, and 32 times in the backbone network as base features. Then, to predict targets of different sizes, the prediction feature maps are sized at 8-, 16-, and 32-times downsampling of the picture, and the features of each prediction feature map come from the 3 base feature layers, fused after being downsampled or upsampled to a common size. Taking the prediction layer at 16-times downsampling as an example: its features come from the 3 base feature layers, so to unify them to the same size, the 8-times-downsampled base feature layer is downsampled once and the 32-times-downsampled base feature layer is upsampled once, and the two resulting feature maps are then fused with the 16-times-downsampled base feature layer.
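A PyTorch sketch of this fusion for the 16-times-downsampled prediction layer follows; concatenation with a 1x1 projection is an assumed fusion operator, since the text only states that the three resampled base features are fused:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseToStride16(nn.Module):
    """Fuse the three base feature maps (8x, 16x, and 32x downsampling)
    into the 16x prediction feature: the 8x map is downsampled once,
    the 32x map is upsampled once, and the results are fused with the
    16x map."""

    def __init__(self, c8, c16, c32, out_channels):
        super().__init__()
        self.down = nn.Conv2d(c8, c8, kernel_size=3, stride=2, padding=1)
        self.proj = nn.Conv2d(c8 + c16 + c32, out_channels, kernel_size=1)

    def forward(self, f8, f16, f32):
        f8_down = self.down(f8)                       # stride 8 -> 16
        if f8_down.shape[2:] != f16.shape[2:]:        # guard odd sizes
            f8_down = F.interpolate(f8_down, size=f16.shape[2:])
        f32_up = F.interpolate(f32, size=f16.shape[2:],
                               mode="nearest")        # stride 32 -> 16
        fused = torch.cat([f8_down, f16, f32_up], dim=1)
        return self.proj(fused)
```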
If features of different scales were simply fused, their proportions in the 3 prediction layers would be identical, and the features could not be used with different emphasis for different prediction targets. Therefore, after the features of each prediction layer are fused, an attention module is introduced to learn an appropriate weight for the features of each channel, so that each prediction layer can emphasize the fused features according to the characteristics of the targets it must predict. The network structure is shown in fig. 5. The way the parameters of the attention module are learned is prior art and is not described further.
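A squeeze-and-excitation style module is one common realization of such channel attention; the sketch below is an assumed concrete choice, since the text only specifies an attention module that learns one weight per channel:

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Placed after the feature fusion of each prediction layer: learns
    one weight per channel so each prediction layer can emphasize the
    fused features that matter for its own target scale."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.fc(x.mean(dim=(2, 3)))  # global average pooling
        return x * weights.view(b, c, 1, 1)    # reweight each channel
```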
The present disclosure may use YOLOv3 as the detection network. In such anchor-based detection methods, the design and assignment of anchors is very important: because few anchors can be matched to a small target, the model may learn small targets insufficiently and thus fail to detect them well. A dynamic anchor matching mechanism is therefore adopted: the IoU (Intersection over Union) threshold for matching anchors to a ground truth is selected adaptively according to the size of the ground truth, and the threshold is lowered for smaller targets so that more small targets participate in training, improving the model's performance on small-target detection. When training samples are made, the size of the target is known, and the appropriate IoU threshold is then selected based on that size.
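A minimal sketch of such dynamic IoU threshold selection follows; the concrete cutoffs and values are illustrative assumptions, as the patent states only that the threshold is lowered for smaller targets:

```python
def dynamic_iou_threshold(gt_w, gt_h, lo=0.2, hi=0.5, small=20, large=80):
    """Pick the anchor-matching IoU threshold from the ground-truth box
    size: small boxes get a lower threshold so more anchors match them
    and more small targets take part in training."""
    side = max(gt_w, gt_h)
    if side <= small:
        return lo
    if side >= large:
        return hi
    # linear ramp between the two cutoffs
    return lo + (hi - lo) * (side - small) / (large - small)
```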
Step 402, a training sample set is obtained.
In this embodiment, the training sample includes a sample image and annotation information for characterizing the location of small objects in the sample image.
Step 403, enhancing the training samples by at least one of the following: copying, multi-scale change, and editing.
In this embodiment, these are strategies for the problem that the training data contain too few small targets. On the one hand, pictures containing small targets are copied multiple times within the data set, directly increasing the number of small targets in the data. On the other hand, small targets are cut out of the pictures, scaled and rotated, and then randomly pasted to other positions of the images; this not only increases the number of small targets but also introduces more variation, enriching the distribution of the training data.
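The copy-paste augmentation could look like the sketch below; the scale and rotation ranges are assumptions, and overlap checks against existing targets are omitted for brevity:

```python
import random
import cv2

def paste_small_target(image, box):
    """Cut the small-target patch out of the image, randomly scale and
    rotate it, and paste it at a random new location. Returns the new
    image and the pasted box."""
    x1, y1, x2, y2 = box
    patch = image[y1:y2, x1:x2].copy()
    h, w = patch.shape[:2]
    # random scale and rotation around the patch center
    m = cv2.getRotationMatrix2D((w / 2, h / 2),
                                random.uniform(-15, 15),
                                random.uniform(0.8, 1.2))
    patch = cv2.warpAffine(patch, m, (w, h))
    # random destination inside the image
    H, W = image.shape[:2]
    nx, ny = random.randint(0, W - w), random.randint(0, H - h)
    out = image.copy()
    out[ny:ny + h, nx:nx + w] = patch
    return out, (nx, ny, nx + w, ny + h)
```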
Optionally, the training pictures are scaled to different sizes during training, which enriches the range of target scales in the original data set and lets the model adapt to detecting targets of different scales.
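For instance, a multi-scale training input might be produced as follows (the scale set is an assumption):

```python
import random
import cv2

def random_scale(image, scales=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Rescale a training picture by a randomly chosen factor; the
    returned factor must also be applied to the annotation boxes."""
    s = random.choice(scales)
    h, w = image.shape[:2]
    return cv2.resize(image, (int(w * s), int(h * s))), s
```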
And step 404, respectively taking sample images and labeling information in the training samples in the enhanced training sample set as input and expected output of an initial detection model, and training the initial detection model by using a machine learning method.
In this embodiment, the execution body may input the sample image of a training sample in the training sample set into the initial detection model to obtain the position information of the small target in the sample image, take the annotation information of the training sample as the expected output of the initial detection model, and train the initial detection model using a machine learning method. Specifically, the difference between the obtained position information and the annotation information in the training sample may first be calculated with a preset loss function; for example, the L2 norm may be used as the loss function. Then, based on the calculated difference, the network parameters of the initial detection model may be adjusted, and training ends when a preset end condition is met. The preset end conditions may include, but are not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations exceeds a preset count; the calculated difference is smaller than a preset difference threshold.
Here, various implementations may be used to adjust the network parameters of the initial detection model based on the difference between the generated position information and the annotation information in the training sample. For example, the BP (Back Propagation) algorithm or the SGD (Stochastic Gradient Descent) algorithm may be employed.
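Putting the above together, a skeleton of this training procedure might look as follows; the optimizer, hyperparameters, and end condition are illustrative assumptions consistent with the options listed above:

```python
import torch

def train(model, loader, epochs=10, lr=1e-3, loss_eps=1e-4):
    """Train the initial detection model: predicted positions against
    annotated positions under an L2 (MSE) loss, parameters adjusted by
    backpropagation with SGD, stopping when the loss falls below a
    preset threshold or the epoch budget runs out."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = torch.nn.MSELoss()       # L2-norm loss on positions
    for epoch in range(epochs):
        for images, targets in loader:
            opt.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()              # backpropagation (BP)
            opt.step()                   # stochastic gradient descent
        if loss.item() < loss_eps:       # preset end condition
            break
    return model
```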
Step 405, determining the initial detection model obtained by training as a pre-trained detection model.
In this embodiment, the execution subject of the training step may determine the initial detection model trained in step 404 as a pre-trained detection model.
With further reference to fig. 6, as an implementation of the method shown in the above figures, the present disclosure provides an embodiment of an apparatus for detecting small objects, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 6, the apparatus 600 for detecting a small target of the present embodiment includes: an acquisition unit 601, a reduction unit 602, a first detection unit 603, and a second detection unit 604. Wherein the acquisition unit 601 is configured to acquire an original image including a small target; a reduction unit 602 configured to reduce an original image to a low resolution image; a first detection unit 603 configured to identify a candidate region including a small target from the low resolution image using a lightweight segmentation network; the second detection unit 604 is configured to determine the position of the small target in the original image by using the region of the original image corresponding to the candidate region as the region of interest and running a pre-trained detection model on the region of interest.
In this embodiment, specific processes of the acquiring unit 601, the reducing unit 602, the first detecting unit 603, and the second detecting unit 604 of the apparatus 600 for detecting a small target may refer to steps 201, 202, 203, and 204 in the corresponding embodiment of fig. 2.
In some optional implementations of this embodiment, the apparatus 600 further includes a training unit (not shown in the drawings) configured to: determine a network structure of an initial detection model and initialize its network parameters; acquire a training sample set, wherein a training sample comprises a sample image and annotation information characterizing the position of a small target in the sample image; enhance the training samples by at least one of the following: copying, multi-scale change, and editing; take the sample images and the annotation information of the training samples in the enhanced training sample set as the input and the expected output of the initial detection model, respectively, and train the initial detection model using a machine learning method; and determine the trained initial detection model as the pre-trained detection model.
In some optional implementations of this embodiment, the training unit is further configured to: cut a small target out of the sample image; and paste the small target, after scaling and/or rotation, to a random other position in the sample image to obtain a new sample image.
In some optional implementations of this embodiment, the first detection unit is further configured to: when making training samples for the segmentation network, set the pixels inside the rectangular boxes originally annotated for the detection task as positive samples and the pixels outside as negative samples; expand the rectangular boxes of small targets whose length and width are below a predetermined number of pixels; and set the pixels inside the expanded rectangular boxes as positive samples.
In some alternative implementations of the present embodiment, the detection model is a deep neural network.
In some alternative implementations of this embodiment, an attention module is introduced after the feature fusion of each prediction layer to learn an appropriate weight for the features of the different channels.
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., the controller of fig. 1) 700 suitable for use in implementing embodiments of the present disclosure is shown. The controller illustrated in fig. 7 is merely an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., a central processor, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the electronic device 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
In general, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 shows an electronic device 700 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 7 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 709, or installed from storage 708, or installed from ROM 702. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 701. It should be noted that, the computer readable medium according to the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Whereas in embodiments of the present disclosure, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an original image including a small target; reducing the original image to a low resolution image; identifying candidate areas comprising small targets from the low-resolution image by adopting a lightweight segmentation network; and taking the region of the original image corresponding to the candidate region as an interest region, running a pre-trained detection model on the interest region, and determining the position of the small target in the original image.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor, for example described as: a processor includes an acquisition unit, a reduction unit, a first detection unit, and a second detection unit. The names of these units do not in some cases limit the units themselves; for example, the acquisition unit may also be described as "a unit that acquires an original image including a small target".
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention referred to in this disclosure is not limited to the specific combination of features described above, but also encompasses other embodiments formed by any combination of the above features or their equivalents without departing from the inventive concept, for example embodiments formed by substituting the above features with technical features of similar function disclosed in the present disclosure (but not limited thereto).

Claims (14)

1. A method for detecting a small target, comprising:
acquiring an original image comprising a small target through a vehicle-mounted camera;
reducing the original image to a low resolution image;
identifying a candidate region comprising the small target from the low-resolution image using a lightweight segmentation network, wherein a backbone network of the segmentation network adopts ShuffleNet;
and filtering noise points out of the candidate region to form a minimum enclosing rectangle surrounding all remaining suspected target points, taking the region corresponding to the rectangle in the unscaled high-resolution original image as a region of interest, running a pre-trained detection model on the region of interest, and determining the position of the small target in the original image.
2. The method of claim 1, wherein the detection model is trained by:
determining a network structure of an initial detection model and initializing network parameters of the initial detection model;
acquiring a training sample set, wherein a training sample comprises a sample image and annotation information for characterizing the position of a small target in the sample image;
enhancing the training samples by at least one of the following: copying, multi-scale change, and editing;
taking the sample images and the annotation information of the training samples in the enhanced training sample set as the input and the expected output of the initial detection model, respectively, and training the initial detection model using a machine learning method;
and determining the initial detection model obtained through training as the pre-trained detection model.
3. The method of claim 2, wherein the training samples are edited by:
cutting a small target out of the sample image;
and pasting the small target, after scaling and/or rotation, to a random other position in the sample image to obtain a new sample image.
4. The method of claim 1, wherein the method further comprises:
when making training samples for the segmentation network, setting the pixels inside the rectangular boxes originally annotated for the detection task as positive samples, and the pixels outside the rectangular boxes as negative samples;
expanding the rectangular boxes of small targets whose length and width are below a predetermined number of pixels;
and setting the pixels inside the expanded rectangular boxes as positive samples.
5. A method according to any of claims 1-3, wherein the detection model is a deep neural network.
6. The method of claim 5, wherein an attention module is introduced after each prediction layer's feature fusion to learn an appropriate weight for the features of the different channels.
7. An apparatus for detecting a small target, comprising:
an acquisition unit configured to acquire an original image including a small target through a vehicle-mounted camera;
a reduction unit configured to reduce the original image to a low resolution image;
a first detection unit configured to identify a candidate region comprising the small target from the low-resolution image using a lightweight segmentation network, wherein a backbone network of the segmentation network adopts ShuffleNet;
and a second detection unit configured to filter noise points out of the candidate region to form a minimum enclosing rectangle surrounding all remaining suspected target points, take the region corresponding to the rectangle in the unscaled high-resolution original image as a region of interest, run a pre-trained detection model on the region of interest, and determine the position of the small target in the original image.
8. The apparatus of claim 7, wherein the apparatus further comprises a training unit configured to:
determining a network structure of an initial detection model and initializing network parameters of the initial detection model;
acquiring a training sample set, wherein a training sample comprises a sample image and annotation information for characterizing the position of a small target in the sample image;
enhancing the training samples by at least one of the following: copying, multi-scale change, and editing;
taking the sample images and the annotation information of the training samples in the enhanced training sample set as the input and the expected output of the initial detection model, respectively, and training the initial detection model using a machine learning method;
and determining the initial detection model obtained through training as the pre-trained detection model.
9. The apparatus of claim 8, wherein the training unit is further configured to:
cutting a small target out of the sample image;
and pasting the small target, after scaling and/or rotation, to a random other position in the sample image to obtain a new sample image.
10. The apparatus of claim 7, wherein the first detection unit is further configured to:
when making training samples for the segmentation network, setting the pixels inside the rectangular boxes originally annotated for the detection task as positive samples, and the pixels outside the rectangular boxes as negative samples;
expanding the rectangular boxes of small targets whose length and width are below a predetermined number of pixels;
and setting the pixels inside the expanded rectangular boxes as positive samples.
11. The apparatus of one of claims 7-10, wherein the detection model is a deep neural network.
12. The apparatus of claim 11, wherein an attention module is introduced after each prediction layer's feature fusion to learn an appropriate weight for the features of the different channels.
13. An electronic device for detecting small objects, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
which, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-6.
14. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-6.
CN202010461384.2A 2020-05-27 2020-05-27 Method and device for detecting small objects Active CN111626208B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010461384.2A CN111626208B (en) 2020-05-27 2020-05-27 Method and device for detecting small objects
JP2021051677A JP7262503B2 (en) 2020-05-27 2021-03-25 Method and apparatus, electronic device, computer readable storage medium and computer program for detecting small targets
KR1020210040639A KR102523886B1 (en) 2020-05-27 2021-03-29 A method and a device for detecting small target

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010461384.2A CN111626208B (en) 2020-05-27 2020-05-27 Method and device for detecting small objects

Publications (2)

Publication Number Publication Date
CN111626208A CN111626208A (en) 2020-09-04
CN111626208B (en) 2023-06-13

Family

ID=72272663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010461384.2A Active CN111626208B (en) 2020-05-27 2020-05-27 Method and device for detecting small objects

Country Status (3)

Country Link
JP (1) JP7262503B2 (en)
KR (1) KR102523886B1 (en)
CN (1) CN111626208B (en)

Families Citing this family (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112418345B (en) * 2020-12-07 2024-02-23 深圳小阳软件有限公司 Method and device for quickly identifying small targets with fine granularity
CN112633218B (en) * 2020-12-30 2023-10-13 深圳市优必选科技股份有限公司 Face detection method, face detection device, terminal equipment and computer readable storage medium
CN112801169B (en) * 2021-01-25 2024-02-06 中国人民解放军陆军工程大学 Camouflage target detection method, system, device and storage medium based on improved YOLO algorithm
CN113158743B (en) * 2021-01-29 2022-07-12 中国科学院自动化研究所 Small target real-time detection and positioning method, system and equipment based on priori knowledge
CN113011297B (en) * 2021-03-09 2024-07-19 全球能源互联网研究院有限公司 Power equipment detection method, device, equipment and server based on edge cloud cooperation
CN113223026A (en) * 2021-04-14 2021-08-06 山东师范大学 Contour-based target fruit image example segmentation method and system
CN113095434B (en) * 2021-04-27 2024-06-11 深圳市商汤科技有限公司 Target detection method and device, electronic equipment and storage medium
CN113139483B (en) * 2021-04-28 2023-09-29 北京百度网讯科技有限公司 Human behavior recognition method, device, apparatus, storage medium, and program product
CN113295298B (en) * 2021-05-19 2025-06-06 深圳市朗驰欣创科技股份有限公司 Temperature measurement method, temperature measurement device, terminal equipment and storage medium
CN113221823B (en) * 2021-05-31 2024-06-07 南通大学 Traffic signal lamp countdown identification method based on improved lightweight YOLOv3
CN113221925B (en) * 2021-06-18 2022-11-11 北京理工大学 Target detection method and device based on multi-scale image
CN113591569A (en) * 2021-06-28 2021-11-02 北京百度网讯科技有限公司 Obstacle detection method, obstacle detection device, electronic apparatus, and storage medium
CN113360791B (en) * 2021-06-29 2023-07-18 北京百度网讯科技有限公司 Interest point query method and device of electronic map, road side equipment and vehicle
CN113553979B (en) * 2021-07-30 2023-08-08 国电汉川发电有限公司 A safety clothing detection method and system based on improved YOLO V5
CN113673604B (en) * 2021-08-23 2025-02-28 浙江大华技术股份有限公司 Target detection method and device, storage medium and electronic device
CN113628208B (en) * 2021-08-30 2024-02-06 北京中星天视科技有限公司 Ship detection method, device, electronic equipment and computer readable medium
KR102660084B1 (en) * 2021-09-30 2024-04-22 연세대학교 산학협력단 Apparatus and Method for Detecting 3D Object
CN113989592A (en) * 2021-10-28 2022-01-28 三一建筑机器人(西安)研究院有限公司 A method, device and electronic device for extending semantically segmented image samples
CN114155466B (en) * 2021-11-30 2024-08-13 云控智行科技有限公司 Target recognition method and device based on deep learning
CN114241345B (en) * 2021-12-21 2025-04-22 中国农业科学院农业信息研究所 Method and device for detecting target in image
CN114387225B (en) * 2021-12-23 2024-12-10 沈阳东软智能医疗科技研究院有限公司 Bone joint image recognition method, device, electronic device and readable medium
CN114298952A (en) * 2021-12-29 2022-04-08 深存科技(无锡)有限公司 Label image generation method, device, equipment and storage medium
CN115294380B (en) * 2022-01-05 2025-03-25 邵阳学院 A dynamic training method for deep learning object detection
CN114387581B * 2022-01-12 2024-10-18 广州图元跃迁电子科技有限公司 Vehicle surrounding identifier recognition method, device, storage medium, and computer equipment
CN114973306A (en) * 2022-01-21 2022-08-30 昆明理工大学 Fine-scale embedded lightweight infrared real-time detection method and system
WO2023153781A1 (en) * 2022-02-08 2023-08-17 Samsung Electronics Co., Ltd. Method and electronic device for processing input frame for on-device ai model
CN114612739A * 2022-02-24 2022-06-10 江西裕丰智能农业科技有限公司 Binocular panoramic image target detection method, device, and computer equipment
CN114565942A (en) * 2022-02-26 2022-05-31 南京理工大学 Live pig face detection method based on compressed YOLOv5
CN114463854A (en) * 2022-03-04 2022-05-10 河北工程大学 Device and method for gesture recognition switch based on deep learning
CN114581523B (en) * 2022-03-04 2025-04-15 京东鲲鹏(江苏)科技有限公司 A method and device for determining annotation data for monocular 3D target detection
CN114595759B (en) * 2022-03-07 2024-12-20 卡奥斯工业智能研究院(青岛)有限公司 Protective gear identification method, device, electronic device and storage medium
CN114298912B (en) * 2022-03-08 2022-10-14 北京万里红科技有限公司 Image acquisition method and device, electronic equipment and storage medium
CN114863384A (en) * 2022-03-22 2022-08-05 上海电力大学 A traffic sign detection method based on YOLO v4 algorithm
CN115131281A * 2022-04-01 2022-09-30 腾讯科技(深圳)有限公司 Change detection model training and image change detection method, apparatus, and device
CN114926704B (en) * 2022-04-26 2025-05-23 南京信息工程大学 Target detection method based on deep learning
CN114821269B (en) * 2022-05-10 2024-11-26 安徽蔚来智驾科技有限公司 Multi-task target detection method, device, autonomous driving system and storage medium
CN114973288B * 2022-05-30 2024-08-30 成都人人互娱科技有限公司 Non-commodity image-text detection method, system, and computer storage medium
CN115493583A * 2022-07-06 2022-12-20 北京航空航天大学 Integrated method for astronomical target detection and accurate positioning
CN115063660A (en) * 2022-07-19 2022-09-16 北京京东乾石科技有限公司 Image detection method and device
CN117541771A (en) * 2022-08-01 2024-02-09 马上消费金融股份有限公司 Image recognition model training method and image recognition method
CN115294065A * 2022-08-08 2022-11-04 山东大学 FPC-BTB interface detection and positioning method and system based on tph-yolov5 deep learning
CN115620157B * 2022-09-21 2024-07-09 清华大学 Method and device for representation learning of satellite images
CN115546500B (en) * 2022-10-31 2025-07-18 西安交通大学 Infrared image small target detection method
CN115731243B (en) * 2022-11-29 2024-02-09 北京长木谷医疗科技股份有限公司 Spine image segmentation method and device based on artificial intelligence and attention mechanism
CN115984084B (en) * 2022-12-19 2023-06-06 中国科学院空天信息创新研究院 A Remote Sensing Distributed Data Processing Method Based on Dynamic Separable Network
CN118279896A (en) * 2022-12-29 2024-07-02 北京图森智途科技有限公司 Three-dimensional object detection method, apparatus and computer-readable storage medium
CN117173423B (en) * 2023-08-09 2024-07-23 山东财经大学 Method, system, equipment and medium for detecting small image target
CN117078980A * 2023-08-28 2023-11-17 西安工业大学 Deep learning method for locating infrared dim and small targets
CN116912604B (en) * 2023-09-12 2024-01-16 浙江大华技术股份有限公司 Model training method, image recognition device and computer storage medium
CN117218505A (en) * 2023-09-25 2023-12-12 佳源科技股份有限公司 Substation state indicator lamp identification method based on deep learning
CN117671458B * 2023-12-20 2024-06-14 云南神火铝业有限公司 Construction method and application of a detection model for automatically identifying block anode scrap
CN117746191B * 2024-02-07 2024-05-10 浙江啄云智能科技有限公司 Image-search model training method and image-search method
CN117746028B (en) * 2024-02-08 2024-06-11 暗物智能科技(广州)有限公司 Visual detection method, device, equipment and medium for unlabeled articles
CN118504650B (en) * 2024-05-10 2025-02-28 广东电网有限责任公司 Ice detection model training method, device, electronic device and storage medium
CN118172547B (en) * 2024-05-16 2024-07-30 北京航空航天大学杭州创新研究院 Image target recognition method, device, electronic device and computer readable medium
CN118365990B * 2024-06-19 2024-08-30 浙江啄云智能科技有限公司 Model training method and device for contraband detection, and electronic equipment
CN118657927B (en) * 2024-07-08 2024-11-29 北京鼎星科技有限公司 Improved YOLOv n small target detection method based on feature fusion
CN118674723B (en) * 2024-08-23 2024-11-15 南京华视智能科技股份有限公司 Method for detecting virtual edges of coated ceramic area based on deep learning
CN118781499B (en) * 2024-09-10 2024-12-24 中国科学院自动化研究所 On-orbit real-time detection and identification method and device
CN119169081B (en) * 2024-11-18 2025-03-07 之江实验室 Geomagnetic adaptive area positioning method and device for geomagnetic navigation
CN119323743B (en) * 2024-12-19 2025-03-21 北京中科思创云智能科技有限公司 Visible light and thermal infrared image target fusion detection method from the perspective of UAV vision

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4420459B2 (en) * 2005-06-14 2010-02-24 キヤノン株式会社 Image processing apparatus and method
US10740607B2 (en) * 2017-08-18 2020-08-11 Autel Robotics Co., Ltd. Method for determining target through intelligent following of unmanned aerial vehicle, unmanned aerial vehicle and remote control
US10973486B2 (en) 2018-01-08 2021-04-13 Progenics Pharmaceuticals, Inc. Systems and methods for rapid neural network-based image segmentation and radiopharmaceutical uptake determination
CN110119734A (en) * 2018-02-06 2019-08-13 同方威视技术股份有限公司 Cutter detection method and device
US10936905B2 (en) 2018-07-06 2021-03-02 Tata Consultancy Services Limited Method and system for automatic object annotation using deep network
CN109344821A (en) * 2018-08-30 2019-02-15 西安电子科技大学 Small target detection method based on feature fusion and deep learning
CN110298226B * 2019-04-03 2023-01-06 复旦大学 Cascade detection method for objects carried on the human body in millimeter-wave images
CN110503112B (en) * 2019-08-27 2023-02-03 电子科技大学 A Small Target Detection and Recognition Method Based on Enhanced Feature Learning
CN110866925B (en) * 2019-10-18 2023-05-26 拜耳股份有限公司 Method and device for image segmentation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598912A (en) * 2015-01-23 2015-05-06 湖南科技大学 Traffic light detection and recognition method based on CPU and GPU cooperative computing
CN109829456A (en) * 2017-11-23 2019-05-31 腾讯科技(深圳)有限公司 Image recognition method, device, and terminal
CN108229575A (en) * 2018-01-19 2018-06-29 百度在线网络技术(北京)有限公司 Method and apparatus for detecting a target
WO2020020472A1 (en) * 2018-07-24 2020-01-30 Fundación Centro Tecnoloxico De Telecomunicacións De Galicia A computer-implemented method and system for detecting small objects on an image using convolutional neural networks
CN110909756A (en) * 2018-09-18 2020-03-24 苏宁 Convolutional neural network model training method and device for medical image recognition
CN109858472A (en) * 2019-04-09 2019-06-07 武汉领普科技有限公司 Real-time embedded humanoid detection method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on small-target detection technology for fabric defects based on deep learning; Zhao Yanan; China Masters' Theses Full-text Database, Engineering Science and Technology I (No. 2); B024-95 *
Small-target semantic segmentation algorithm combined with object detection; Hu Tai et al.; Journal of Nanjing University (Natural Science), Vol. 55, No. 1; 73-84 *

Also Published As

Publication number Publication date
KR102523886B1 (en) 2023-04-21
JP2021179971A (en) 2021-11-18
KR20210042275A (en) 2021-04-19
JP7262503B2 (en) 2023-04-21
CN111626208A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
CN111626208B (en) Method and device for detecting small objects
CN111931929B (en) Training method and device for a multi-task model, and storage medium
CN111582189B (en) Traffic signal lamp identification method and device, vehicle-mounted control terminal and motor vehicle
CN111401255B (en) Method and device for identifying bifurcation junctions
CN112307978B (en) Target detection method and device, electronic equipment and readable storage medium
CN111310770B (en) Target detection method and device
CN111860227A (en) Method, apparatus, and computer storage medium for training trajectory planning model
CN113963238A (en) Construction method of multitask perception recognition model and multitask perception recognition method
CN113409393B (en) Method and device for identifying traffic sign
CN119251785A (en) Target detection method, device, equipment and storage medium
CN111340880B (en) Method and apparatus for generating predictive model
CN115512336B (en) Vehicle positioning method and device based on street lamp light source and electronic equipment
CN108960160B (en) Method and device for predicting structured state quantity based on unstructured prediction model
CN112215042A (en) A parking space limiter identification method, system and computer equipment
CN116012814A (en) Signal lamp identification method, signal lamp identification device, electronic equipment and computer readable storage medium
CN115424068A (en) Pedestrian intention prediction method, system, electronic device and readable storage medium
CN113902047A (en) Image element matching method, device, equipment and storage medium
US12147232B2 (en) Method, system and computer program product for the automated locating of a vehicle
CN110807397A (en) Method and device for predicting motion state of target object
CN114612353B (en) Image processing method and device
CN115019278B (en) Lane line fitting method and device, electronic equipment and medium
US20240176008A1 (en) Enhanced Tracking and Speed Detection
CN119540899A (en) Multi-target detection method and model training method, edge computing device and medium
CN117058195A (en) Target detection method, device, equipment and storage medium
HK40038752A (en) Target detection method and apparatus, electronic device, and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211009

Address after: 100176 101, Floor 1, Building 1, Yard 7, Ruihe West 2nd Road, Economic and Technological Development Zone, Daxing District, Beijing

Applicant after: Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd.

Address before: 2/F, Baidu Building, No. 10, Shangdi 10th Street, Haidian District, Beijing 100085

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

GR01 Patent grant