
CN113065558A - Lightweight small target detection method combined with attention mechanism - Google Patents

Publication number
CN113065558A
CN113065558A CN202110432768.6A CN202110432768A CN113065558A CN 113065558 A CN113065558 A CN 113065558A CN 202110432768 A CN202110432768 A CN 202110432768A CN 113065558 A CN113065558 A CN 113065558A
Authority
CN
China
Prior art keywords
network
small target
target detection
module
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110432768.6A
Other languages
Chinese (zh)
Other versions
CN113065558B (en)
Inventor
朱威
王立凯
靳作宝
何德峰
郑雅羽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN202110432768.6A
Publication of CN113065558A
Application granted
Publication of CN113065558B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a lightweight small target detection method combined with an attention mechanism, comprising the following steps: (1) building a small target detection network based on YOLOv4: an MSE multi-scale attention module is constructed and inserted into the feature extraction network, a shallow feature map is added as a prediction layer, and the SPP module is improved, strengthening feature extraction; (2) constructing a small target data set, augmenting the training set with a data enhancement strategy, and customizing the anchor boxes; (3) performing channel pruning on the model and using knowledge distillation to recover the model accuracy; (4) inputting a UAV aerial image to obtain the target classification and localization results. By using a channel attention mechanism and a model compression strategy, the invention effectively reduces false and missed detections of small targets while preserving the real-time performance of the model.

Description

Lightweight small target detection method combined with attention mechanism
Technical Field
The invention belongs to the application of a deep learning technology in the field of machine vision, and particularly relates to a lightweight small target detection method combined with an attention mechanism.
Background
Target detection identifies the category and precise position of specific targets in a given image. Small target detection is an important research topic within this field and has significant application value in scenarios such as target recognition in remote sensing images, target recognition in infrared imaging, and agricultural pest identification. In target detection, a target that occupies 0.12% or less of the pixels of the whole image, or that is smaller than 32 × 32 pixels, is generally called a small target. Detecting small targets in an image is very difficult because they have low resolution and are noisy, so the features extracted after multiple convolution layers are often not distinctive.
Early small target detection relied mainly on manually designed methods for obtaining the feature information of the target. Wen Peizhi et al. applied the wavelet transform to small target detection (see Wen Peizhi et al., Infrared small target detection method under sea background [J], Photoelectric Engineering, 2004): multi-resolution analysis based on orthogonal wavelet decomposition selects frequency bands and suppresses noise and background interference, edges in different directions are fused to obtain candidate points, and interfering targets are finally eliminated with a gray-level threshold. Chen et al. (see C. L. P. Chen, H. Li, Y. Wei, et al., A Local Contrast Method for Small Infrared Target Detection [J], IEEE Transactions on Geoscience and Remote Sensing, 2014, 52(1): 574-581), inspired by biological visual mechanisms, use a proposed local contrast measure to obtain a local contrast map of the input image. The map represents the difference between the current location and its neighborhood, which enhances the target signal and suppresses background clutter at the same time; the target is finally segmented with an adaptive threshold. These methods start from the low-level features of the image and use basic image characteristics to perform detection. They are simple to operate, but for small targets against complex backgrounds they suffer from missed detections, false detections, and poor real-time performance.
In recent years, with the growth of computing power and the rapid development of deep learning theory, deep learning techniques have been widely used for target detection. Currently popular target detection models fall roughly into two categories. One-stage detection algorithms treat classification and localization together as a regression task; representative algorithms are SSD and YOLO. Two-stage detection algorithms separate candidate box selection from target classification; representative algorithms are R-CNN and Faster R-CNN. Because a one-stage detection algorithm treats the whole detection task as a regression operation, it has a large advantage in real-time performance.
The main ways of improving small target detection with deep learning are multi-scale representation, context information, super-resolution, and the like. The patent with application number CN202010537199.7 discloses a method for detecting small objects in pictures. It obtains six feature maps of different sizes from the picture to be detected, fuses the bottom-level and high-level pyramid feature maps among them using bilinear interpolation to obtain six new feature maps of different sizes, and uses the new feature maps for prediction. The method uses multi-scale feature maps to enhance target feature information, but it is easily disturbed by complex backgrounds and has a relatively high false detection rate. The patent with application number CN202010444356.X discloses a resolution-enhancement-based method for detecting small targets in remote sensing images. It applies super-resolution processing to a remote sensing image containing small targets and then performs target detection, addressing the problems that small targets in remote sensing images offer little usable feature information and that small target regions are geometrically deformed. Super-resolution processing refines the detailed feature information of small targets, and a region-based deformable convolution network makes full use of their limited feature information, improving the detection of small targets in remote sensing images. Although this method achieves good accuracy, increasing the picture resolution reduces the real-time performance of the network and works against making the network lightweight.
Disclosure of Invention
To solve the problems of existing target detection methods when detecting small targets, such as a high false detection rate, missed detections, and poor real-time performance, the invention provides a lightweight small target detection method combined with an attention mechanism, comprising the following steps:
(1) building improved small target detection network based on YOLOv4
The small target detection network is obtained by improving a one-stage target detection network YOLOv4, and the specific network structure improvement comprises the following three aspects:
(1-1) constructing an MSE multi-scale attention mechanism module, and inserting the MSE multi-scale attention mechanism module into a feature extraction network
The MSE multi-scale attention mechanism module constructed by the invention is an improvement of the SE attention module, a lightweight attention mechanism module for computer vision proposed by Hu et al. in 2017. The MSE module can be conveniently inserted between two network layers of a feature extraction network; by learning global information it selects and emphasizes the feature channels of interest and suppresses irrelevant interference.
An MSE multi-scale attention mechanism module is constructed and inserted between a Concat layer and a CBM module in each CSP module of a YOLOv4 feature extraction network CSPDarknet53 to form a new MSE-CSPUnit module, and the feature extraction network of the MSE-CSPDarknet53 with attention information is obtained. The specific steps of the construction of the MSE multi-scale attention mechanism module are as follows:
(1-1-1) First, the output of the Concat layer of the CSP module is taken as the input feature map. Feature maps of several scales are integrated through convolution kernels of different sizes, and the next feature extraction operation is carried out on these multi-scale feature maps. The convolution kernel sizes are 3 × 3, 5 × 5, and 7 × 7. To avoid the explosion in parameter count caused by large convolution kernels, two layers of 3 × 3 convolution kernels are used in place of the 5 × 5 kernel, and three layers of 3 × 3 convolution kernels in place of the 7 × 7 kernel. Let the input feature map be $X \in \mathbb{R}^{C \times H \times W}$, where C, H, and W are the input channel count, input height, and input width, respectively. Feature extraction on the input feature map with convolution kernels of different sizes proceeds as follows:
$X_c = V_{3 \times 3} X + V_{5 \times 5} X + V_{7 \times 7} X$
where $X_c$ is the multi-scale feature map output and V denotes the convolution operation with convolution kernels of different sizes.
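The substitution of stacked 3 × 3 kernels for larger ones can be checked with simple arithmetic. The sketch below is illustrative only: it assumes stride-1 convolutions with equal input and output channel counts and ignores biases. The helper names and the channel count of 64 are hypothetical and do not come from the patent.

```python
def conv_params(kernel_size: int, channels: int, layers: int = 1) -> int:
    """Weights in `layers` stacked k x k convolutions with `channels` in and out (no bias)."""
    return layers * kernel_size * kernel_size * channels * channels


def stacked_receptive_field(kernel_size: int, layers: int) -> int:
    """Receptive field of `layers` stacked stride-1 convolutions of one kernel size."""
    return 1 + layers * (kernel_size - 1)


if __name__ == "__main__":
    channels = 64  # hypothetical channel count, chosen only for illustration
    for big_kernel, n_layers in ((5, 2), (7, 3)):
        # The stack of 3 x 3 layers covers the same receptive field as the big kernel.
        assert stacked_receptive_field(3, n_layers) == big_kernel
        print(big_kernel, conv_params(big_kernel, channels), conv_params(3, channels, n_layers))
```

Under these assumptions, two 3 × 3 layers need 73,728 weights against 102,400 for a single 5 × 5 layer, and three 3 × 3 layers need 110,592 against 200,704 for a single 7 × 7 layer, while covering the same receptive field.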
(1-1-2) A squeeze operation is performed on $X_c$: the channels are squeezed with global average pooling and global maximum pooling respectively to obtain channel-level feature information. Global average pooling emphasizes the global features of the feature map, and global maximum pooling emphasizes its local features:
$X_{avg} = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} X_c(i, j)$
$X_{max} = \max_{i,j} X_c(i, j)$
where $X_c$ is the input multi-scale feature, $X_{avg}$ is the feature obtained after global average pooling, $X_{max}$ is the feature obtained after global maximum pooling, i = 1, 2, …, H, j = 1, 2, …, W, and H and W are the input height and input width, respectively.
(1-1-3) An excitation operation is performed on $X_{avg}$ and $X_{max}$ separately, and the channel attention weight information $X_s$ is generated through addition and normalization. The Mish activation function is used in the excitation operation to preserve more of the non-linear relationships between channels. $FC_1$ and $FC_2$ are two different fully connected layers, where
$FC_1 \in \mathbb{R}^{\frac{C}{r} \times C}, \quad FC_2 \in \mathbb{R}^{C \times \frac{C}{r}}$
C is the input channel count and r is the dimension reduction ratio. $FC_1$ reduces the dimension to cut the number of fully connected layer parameters, and $FC_2$ restores the original dimension. The excitation and normalization operations are as follows:
$X_a = FC_2(\mathrm{Mish}(FC_1(X_{avg})))$
$X_m = FC_2(\mathrm{Mish}(FC_1(X_{max})))$
$X_s = \mathrm{Softmax}(X_a + X_m)$
where Mish is a nonlinear activation function and Softmax is a normalization function.
(1-1-4) The channel attention weight information generated in (1-1-3) is used to weight the multi-scale feature map generated in (1-1-1), giving the output $X_{weight}$ of the MSE multi-scale attention module; $X_{weight}$ serves as the input to the CBM module in the MSE-CSPUnit module.
$X_{weight} = \mathrm{Scale}(X_c, X_s)$
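Steps (1-1-2) to (1-1-4) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the multi-scale feature map X_c is taken as given (the convolutions of step (1-1-1) are omitted), the fully connected layers are bias-free weight matrices, and all function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]
FeatureMap = List[Matrix]  # C channels, each an H x W grid


def mish(x: float) -> float:
    """Mish activation, x * tanh(softplus(x)), with a numerically stable softplus."""
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def matvec(weights: Matrix, vec: Vector) -> Vector:
    """Bias-free fully connected layer (assumption: no bias term)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def softmax(vec: Vector) -> Vector:
    peak = max(vec)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


def channel_weights(x_c: FeatureMap, fc1: Matrix, fc2: Matrix) -> Vector:
    """Return X_s, one attention weight per channel of X_c."""
    # Squeeze: global average pooling and global maximum pooling per channel.
    x_avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x_c]
    x_max = [max(max(row) for row in ch) for ch in x_c]

    def excite(v: Vector) -> Vector:
        # FC1 (C -> C/r), Mish, then FC2 (C/r -> C); both branches share the layers.
        return matvec(fc2, [mish(h) for h in matvec(fc1, v)])

    x_a, x_m = excite(x_avg), excite(x_max)
    return softmax([a + m for a, m in zip(x_a, x_m)])


def scale(x_c: FeatureMap, x_s: Vector) -> FeatureMap:
    """X_weight: every channel of X_c multiplied by its attention weight."""
    return [[[v * w for v in row] for row in ch] for ch, w in zip(x_c, x_s)]
```

With C = 2 and r = 2, `fc1` is a 1 × 2 matrix and `fc2` a 2 × 1 matrix; `scale(x_c, channel_weights(x_c, fc1, fc2))` then has the same shape as `x_c`.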
(1-2) adding shallow feature map as prediction layer
Deep features carry stronger semantic information and are better suited to localization, while shallow features retain rich resolution information, which benefits the detection of small targets. The 19 × 19 feature map output by the FPN and PAN structures is deleted, and their original 38 × 38 and 76 × 76 output feature maps are kept. The FPN and PAN structures then fuse the output of MSE-CSPUnit ×2 with the upsampled result of the deeper feature map to obtain a shallow feature map of size 152 × 152. Finally, three feature maps of different sizes, namely 38 × 38, 76 × 76, and 152 × 152, are used to predict targets of different scales.
Here MSE-CSPUnit ×2 refers to two MSE-CSPUnit modules.
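The feature map sizes quoted above are consistent with a 608 × 608 network input and downsampling strides of 32, 16, 8, and 4. The input size is an assumption, since the text does not state it, and the helper below is a hypothetical sketch of that arithmetic.

```python
from typing import Dict, Sequence


def feature_map_sizes(input_size: int, strides: Sequence[int] = (4, 8, 16, 32)) -> Dict[int, int]:
    """Map each downsampling stride to the side length of its square feature map."""
    for stride in strides:
        if input_size % stride:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
    return {stride: input_size // stride for stride in strides}


if __name__ == "__main__":
    sizes = feature_map_sizes(608)  # 608 is an assumed input resolution
    # The stride-32 map (19 x 19) is dropped; the stride-4 map (152 x 152) is added.
    kept = {s: n for s, n in sizes.items() if s != 32}
    print(kept)
```

Dropping the stride-32 output and adding the stride-4 output quadruples the number of grid cells on the finest prediction layer relative to the 76 × 76 map, which is where the extra resolution for small targets comes from.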
(1-3) SPP Module improvements
The SPP module enriches the expressive power of the feature map and provides important context information. To improve small target detection performance, an SPP module is placed in front of each of the 38 × 38, 76 × 76, and 152 × 152 feature maps, achieving effective fusion of local and global features. The SPP module applies maximum pooling with sizes 1 × 1, 5 × 5, 9 × 9, and 13 × 13 to the input feature map and then concatenates the resulting feature maps of different scales.
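The pooling-and-concatenation step of the SPP module can be sketched for a single channel as follows. The sketch assumes stride-1 pooling with implicit padding so that every pooled map keeps the input size, which is what makes concatenation possible; the text does not spell out stride or padding, and the function names are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def max_pool_same(fmap: Grid, k: int) -> Grid:
    """k x k max pooling with stride 1; windows are clipped at the border,
    which is equivalent to padding with minus infinity."""
    h, w, r = len(fmap), len(fmap[0]), k // 2
    return [
        [
            max(
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def spp(fmap: Grid, kernels: Sequence[int] = (1, 5, 9, 13)) -> List[Grid]:
    """Return one pooled map per kernel size; the list plays the role of the
    channel-wise concatenation of the pooled feature maps."""
    return [max_pool_same(fmap, k) for k in kernels]
```

The 1 × 1 branch leaves the input unchanged, so the concatenated output keeps the original local features alongside the progressively wider context from the 5 × 5, 9 × 9, and 13 × 13 branches.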
(2) Training and optimizing small target detection networks
A small target detection data set is constructed for the specific application scenario. Data enhancement applies multi-modal random adjustment to the image data: the number of small targets and the image brightness, contrast, and saturation are adjusted at random, which strengthens the generalization of the model.
Finally, anchor boxes are set to fit the targets in the data set: the anchor boxes of the target data set are re-clustered with the K-means++ algorithm to obtain anchor box parameters better suited to the current data set, which speeds up network convergence.
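Anchor re-clustering of this kind can be sketched as below. The text names only the K-means++ algorithm; the distance 1 − IoU between width-height pairs and the mean-based centre update are assumptions borrowed from common YOLO practice, and the function names are hypothetical.

```python
import random
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height) of a labelled box


def iou_wh(a: Box, b: Box) -> float:
    """IoU of two boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_anchors(boxes: List[Box], k: int, iters: int = 100, seed: int = 0) -> List[Box]:
    rng = random.Random(seed)
    # K-means++ seeding: first centre uniformly at random, each further centre
    # with probability proportional to the squared distance to the nearest centre.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centres) ** 2 for b in boxes]
        if sum(d2) > 0:
            centres.append(rng.choices(boxes, weights=d2, k=1)[0])
        else:  # every box coincides with a centre already
            centres.append(rng.choice(boxes))
    # Lloyd iterations: assign each box to the centre with the highest IoU,
    # then move each centre to the mean width and height of its group.
    for _ in range(iters):
        groups = [[] for _ in centres]
        for b in boxes:
            nearest = max(range(k), key=lambda i: iou_wh(b, centres[i]))
            groups[nearest].append(b)
        updated = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g)) if g else c
            for g, c in zip(groups, centres)
        ]
        if updated == centres:
            break
        centres = updated
    return sorted(centres, key=lambda c: c[0] * c[1])  # smallest area first
```

Calling `kmeanspp_anchors(boxes, 9)` on the width-height pairs of all labelled boxes would give nine anchors, three per prediction layer in the usual YOLOv4 arrangement.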
(3) Model lightweight for small target detection network
(3-1) channel pruning
To address the parameter redundancy of the network, channel pruning is performed on the small target detection network. The γ of the BN layer in each YOLOv4 convolution module is used as a scaling factor, and an L1 regularization term on the BN-layer γ is added to the loss function. The network undergoes a preset number of rounds of sparse training; the γ values obtained after gradient updating are sorted, a pruning threshold is set, and the channels whose γ is smaller than the pruning threshold are removed to obtain the pruned lightweight YOLOv4 network. In the YOLOv4 network, except for the convolution layer and the SPP structure before the upsampling layer, channel pruning is applied to the other convolution modules containing a BN layer, yielding the model file and model structure configuration file after channel pruning. For the YOLOv4 sparse training, the target loss function is:
$L = \sum_{(x, y)} l(f(x, w), y) + \lambda \sum_{\gamma} g(\gamma)$
where x is the input value of the model, y is the desired output value, w is the trainable parameters in the network, g(·) is the penalty term on the scaling factor γ, and λ is the balance factor.
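The effect of the penalty term during sparse training can be sketched as follows, assuming g(γ) = |γ| (the L1 penalty named in the text). Because |γ| is not differentiable at zero, the sketch uses the subgradient sign(γ); the function names and the value of λ are hypothetical.

```python
from typing import List


def sparsity_penalty(gammas: List[float], lam: float) -> float:
    """lambda * sum(|gamma|): the L1 term added to the detection loss."""
    return lam * sum(abs(g) for g in gammas)


def add_l1_subgradient(grads: List[float], gammas: List[float], lam: float) -> List[float]:
    """Add lambda * sign(gamma) to the task-loss gradient of each BN scaling factor."""
    def sign(g: float) -> int:
        return (g > 0) - (g < 0)

    return [dg + lam * sign(g) for dg, g in zip(grads, gammas)]


if __name__ == "__main__":
    gammas = [0.5, -0.25, 0.0]
    task_grads = [0.1, 0.1, 0.1]  # made-up gradients of the detection loss
    print(sparsity_penalty(gammas, lam=0.01))
    print(add_l1_subgradient(task_grads, gammas, lam=0.01))
```

The extra gradient term pushes every non-zero γ toward zero by a constant amount per step, so channels that the detection loss does not actively defend end up with small scaling factors and become candidates for pruning.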
(3-2) knowledge distillation recovery model accuracy
After channel pruning, although the removed channels contributed little to the model output, the accuracy of the pruned model drops slightly, so knowledge distillation is used to recover the model accuracy.
Knowledge distillation is performed with the unpruned YOLOv4 network as the teacher network and the channel-pruned network as the student network. Knowledge distillation for YOLOv4 covers both the classification task and the regression task. For the regression results, the student does not learn directly from the teacher network when the regression loss is computed, because the regression output is unbounded and the prediction of the teacher network may contradict the label values. Instead, the L2 loss between the teacher network and the label values and the L2 loss between the student network and the label values are computed separately, and a range w is set. The L2 loss of the student network is counted in the loss only when it, increased by w, is still larger than the L2 loss of the teacher network; that is, once the student network outperforms the teacher network by more than this margin, the loss of the student network is not counted. The overall loss function is:
$L_b(R_s, R_t, y_{reg}) = \begin{cases} \lVert R_s - y_{reg} \rVert_2^2, & \text{if } \lVert R_s - y_{reg} \rVert_2^2 + w > \lVert R_t - y_{reg} \rVert_2^2 \\ 0, & \text{otherwise} \end{cases}$
$L_{reg} = (1 - v) L_{sL1}(R_s, y_{reg}) + v L_b(R_s, R_t, y_{reg})$
where w is the preset deviation range, $y_{reg}$ is the true label value, $R_t$ and $R_s$ are the regression outputs of the teacher and student networks respectively, $L_b$ is the distillation part of the loss, $L_{sL1}$ is the loss between the student network and the true labels, and v is the balance factor between $L_b$ and $L_{sL1}$, set to 0.1 to 0.5 in the earlier part of network training and to 0.6 to 0.9 in the later part; $L_{reg}$ is the total regression loss in distillation learning.
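The regression part of the distillation loss can be sketched as below. Two assumptions are made and should be read as such: the student's L2 loss counts only while it, plus the margin w, still exceeds the teacher's L2 loss, and the L_sL1 term is taken to be a smooth L1 loss summed over coordinates. All function names and the default β are hypothetical.

```python
from typing import Sequence


def l2_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def smooth_l1(pred: Sequence[float], target: Sequence[float], beta: float = 1.0) -> float:
    """Smooth L1 summed over coordinates (assumed form of L_sL1)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def bounded_loss(r_s, r_t, y_reg, w: float) -> float:
    """L_b: the student's L2 loss, dropped once the student beats the teacher by margin w."""
    student = l2_loss(r_s, y_reg)
    teacher = l2_loss(r_t, y_reg)
    return student if student + w > teacher else 0.0


def regression_loss(r_s, r_t, y_reg, w: float, v: float) -> float:
    """L_reg = (1 - v) * L_sL1(R_s, y_reg) + v * L_b(R_s, R_t, y_reg)."""
    return (1.0 - v) * smooth_l1(r_s, y_reg) + v * bounded_loss(r_s, r_t, y_reg, w)
```

A student prediction that is worse than the teacher's is penalized by both terms, while one that is already better than the teacher by more than w is penalized only through the ground-truth term, so an imperfect teacher cannot pull it away from the labels.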
(4) Detection of input images using trained small target detection network models
One frame of an unmanned aerial vehicle (UAV) aerial image is input and sent to the trained and optimized small target detection network for target localization and classification. The network first feeds the image into the feature extraction network with the attention mechanism to extract features, and three feature maps of different resolutions are output through the SPP modules. Targets at three different scales are detected on the three feature maps using regression and classification, and the classification and localization results are obtained after filtering with a confidence threshold. This is repeated until all pictures in the test set have been detected.
Compared with the prior art, the invention has the following beneficial effects:
Compared with traditional small target detection methods, the invention designs the MSE attention module based on SE (Squeeze-and-Excitation) and inserts it into the YOLOv4 feature extraction network, which strengthens the attention of the network to the region of interest and reduces interference from complex backgrounds during small target detection. A shallow feature map is then added as a prediction layer, and feature maps of three sizes, 38 × 38, 76 × 76, and 152 × 152, predict targets of different scales. The SPP module is improved: an SPP module is placed in front of each of the 38 × 38, 76 × 76, and 152 × 152 feature maps, achieving effective fusion of local and global features. Finally, the model is compressed with channel pruning and knowledge distillation, which greatly reduces the number of model parameters with little loss of accuracy. In addition, data enhancement randomly adjusts the number of small targets and the brightness, contrast, and saturation of the images in the data set, improving the training of the model. On small target data sets, the network achieves better detection performance and robustness while meeting the requirements of lightweight model deployment.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a MSE-CSPUnit module after adding an MSE multi-scale attention mechanism module;
FIG. 3 is a MSE multi-scale attention module structure of the present invention;
FIG. 4 is a small target detection network architecture designed by the present invention;
FIG. 5 is a comparison of the number of channels after compression of the model, where the dark columns are before pruning and the light columns are after pruning;
FIG. 6 shows the detection results of the small target detection network of the invention on target pictures, where (a) and (c) are the results before improvement and (b) and (d) are the corresponding results after improvement.
Detailed Description
The present invention is described in detail below with reference to examples and drawings, but is not limited to them. The target detection embodiment is carried out on the various small targets in a data set. The processing platform is an Intel i9-9900K CPU, an NVIDIA RTX 2080 Ti GPU, and 32 GB of RAM, and the operating system is 64-bit Linux (Ubuntu 18.04). The method is implemented in the deep learning framework PyTorch 1.6.
As shown in FIG. 1, the lightweight small target detection method with an attention mechanism comprises four parts:
(1) building an improved small target detection network based on YOLOv 4;
(2) training and optimizing the small target detection network;
(3) carrying out model lightweight on the small target detection network;
(4) detecting the input image using the trained small target detection network model.
The first part of building an improved small target detection network based on YOLOv4 specifically comprises the following steps:
(1-1) designing an MSE multi-scale attention mechanism module, and embedding the MSE multi-scale attention mechanism module into a feature extraction network
An MSE multi-scale attention mechanism module is constructed and inserted between the Concat layer and the CBM module in each CSP module of the YOLOv4 feature extraction network CSPDarknet53 to form a new MSE-CSPUnit module, giving the MSE-CSPDarknet53 feature extraction network with attention information. As shown in FIG. 2, all modules other than the MSE module are conventional structural modules of the YOLOv4 feature extraction network CSPDarknet53. The MSE multi-scale attention mechanism module is constructed as follows:
First, the output of the Concat layer of the CSP module is taken as the input feature map. Feature maps of several scales are integrated through convolution kernels of different sizes, and the next feature extraction operation is carried out on these multi-scale feature maps. The convolution kernel sizes are 3 × 3, 5 × 5, and 7 × 7. To avoid the explosion in parameter count caused by large convolution kernels, two layers of 3 × 3 convolution kernels are used in place of the 5 × 5 kernel, and three layers of 3 × 3 convolution kernels in place of the 7 × 7 kernel. Let the input feature map be $X \in \mathbb{R}^{C \times H \times W}$, where C, H, and W are the input channel count, input height, and input width, respectively. Feature extraction on the input feature map with convolution kernels of different sizes proceeds as follows:
$X_c = V_{3 \times 3} X + V_{5 \times 5} X + V_{7 \times 7} X$
where $X_c$ is the multi-scale fused feature output and V denotes the convolution operation with convolution kernels of different sizes.
A squeeze operation is performed on $X_c$. Because small targets carry little feature information, global maximum pooling is used to emphasize the local information of the feature map, while global average pooling emphasizes its global features. The pooling operations are as follows:
$X_{avg} = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} X_c(i, j)$
$X_{max} = \max_{i,j} X_c(i, j)$
where $X_{avg}$ is the feature obtained after global average pooling, $X_{max}$ is the feature obtained after global maximum pooling, i = 1, 2, …, H, j = 1, 2, …, W, and H and W are the input height and input width, respectively.
An excitation operation is performed on $X_{avg}$ and $X_{max}$ separately; the results are added and normalized to generate the attention weight information $X_s$. Using the Mish activation function in the excitation operation preserves more of the non-linear relationships between channels. $FC_1$ and $FC_2$ are two different fully connected layers, where
$FC_1 \in \mathbb{R}^{\frac{C}{r} \times C}$
$FC_2 \in \mathbb{R}^{C \times \frac{C}{r}}$
C is the input channel count and r is the dimension reduction ratio. $FC_1$ reduces the dimension to cut the number of fully connected layer parameters, and $FC_2$ restores the original dimension. The excitation and normalization operations are as follows:
$X_a = FC_2(\mathrm{Mish}(FC_1(X_{avg})))$
$X_m = FC_2(\mathrm{Mish}(FC_1(X_{max})))$
$X_s = \mathrm{Softmax}(X_a + X_m)$
where Mish is a nonlinear activation function and Softmax is a normalization function.
$X_s$ is used to weight the multi-scale feature map $X_c$ generated in the first step, giving the output $X_{weight}$ of the MSE multi-scale attention module; $X_{weight}$ serves as the input to the CBM module in the MSE-CSPUnit module.
$X_{weight} = \mathrm{Scale}(X_c, X_s)$
(1-2) adding shallow features in the predicted layer
Deep features carry stronger semantic information and are better suited to localization, while shallow features retain rich resolution information, which benefits the detection of small targets. The 19 × 19 feature map output by the FPN and PAN structures is deleted, and their original 38 × 38 and 76 × 76 output feature maps are kept. The FPN and PAN structures then fuse the output of MSE-CSPUnit ×2 (two MSE-CSPUnit modules) with the upsampled result of the deeper feature map to obtain a shallow feature map of size 152 × 152. Finally, three feature maps of different sizes, namely 38 × 38, 76 × 76, and 152 × 152, are used to predict targets of different scales.
(1-3) SPP Module improvements
The SPP module enriches the expressive power of the feature map and provides important context information. To improve small target detection performance, an SPP module is placed in front of each of the 38 × 38, 76 × 76, and 152 × 152 feature maps, achieving effective fusion of local and global features. The SPP module applies maximum pooling with sizes 1 × 1, 5 × 5, 9 × 9, and 13 × 13 to the input feature map and then concatenates the resulting feature maps of different scales.
The second part of training and optimizing the small target detection network specifically comprises:
(2-1) construction of data set
First, a small target data set is constructed; the UAV aerial photography data set VisDrone2019 is used in the experiments. Because VisDrone2019 was captured from UAVs, it contains many small objects and dense objects, and illumination changes and object occlusion add to its difficulty. In addition, because UAV images are shot from overhead, the detected objects expose fewer features. For pedestrian detection, for example, an image captured from the ground may contain features such as a person's arms and legs, whereas a UAV image may contain only the top of the head.
(2-2) Data augmentation and multi-mode random adjustment of picture data
During network training, online augmentation of the data set is used to improve training on small targets. Because relatively few pictures in the data set may contain small targets, the model can become biased toward medium and large targets during training. Online augmentation copies several small targets within a picture, which artificially increases the number of times small objects appear and raises the probability that an anchor contains a small target, so that the model has the chance to see more small-target training samples. At the same time, the picture is randomly rotated and scaled, and its brightness, contrast and saturation are adjusted, to make the model more robust.
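The copy step of the online augmentation can be sketched as below. This is a minimal illustration, not the patent's implementation: the size limit `max_side`, the number of copies, the retry count and the rule that pasted copies must not overlap existing boxes are all assumptions, and the image is a plain 2-D list of pixel values.

```python
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h) with a top-left origin, in pixels


def overlaps(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share any pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def copy_small_targets(image, boxes: List[Box], max_side=32, copies=2,
                       tries=50, rng=None):
    """Paste extra copies of small boxes at random free positions.

    Returns a new image and the extended box list; inputs are not modified.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    new_boxes = list(boxes)
    for (x, y, bw, bh) in boxes:
        if max(bw, bh) > max_side:
            continue  # only small targets are duplicated
        patch = [row[x:x + bw] for row in image[y:y + bh]]
        for _ in range(copies):
            for _ in range(tries):
                cand = (rng.randint(0, w - bw), rng.randint(0, h - bh), bw, bh)
                if not any(overlaps(cand, b) for b in new_boxes):
                    nx, ny = cand[0], cand[1]
                    for dy in range(bh):
                        out[ny + dy][nx:nx + bw] = patch[dy]
                    new_boxes.append(cand)
                    break
    return out, new_boxes


# A 64 x 64 blank image with one 4 x 4 bright target.
img = [[0] * 64 for _ in range(64)]
for yy in range(10, 14):
    for xx in range(10, 14):
        img[yy][xx] = 255
aug, aug_boxes = copy_small_targets(img, [(10, 10, 4, 4)], rng=random.Random(0))
```

Rotation, scaling and colour adjustment would be applied as separate steps after the copies are pasted.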
(2-3) custom Anchor Box for fitting targets in data sets
For detecting objects of extreme scale, suitable anchor boxes fit the objects in the data set more accurately. For the unmanned aerial vehicle aerial photography data set, the anchor boxes of the target data set are re-clustered with the K-means++ algorithm to obtain anchor box parameters better suited to the current data set. The anchor box parameters obtained by the K-means++ algorithm are (1,4), (2,8), (4,13), (4,5), (8,20), (9,9), (16,29), (16,15), (35,42).
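Anchor re-clustering of this kind can be sketched as follows. The source names only the K-means++ algorithm; the 1 − IoU distance and the mean update used here are assumptions borrowed from common YOLO practice, and the sample boxes are made up for illustration.

```python
import random
from typing import List, Tuple

WH = Tuple[float, float]  # (width, height)


def iou_wh(a: WH, b: WH) -> float:
    """IoU of two boxes aligned at the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def dist(a: WH, b: WH) -> float:
    return 1.0 - iou_wh(a, b)


def kmeanspp_anchors(boxes: List[WH], k: int, iters: int = 50,
                     seed: int = 0) -> List[WH]:
    """Cluster (w, h) pairs into k anchors; needs at least k distinct boxes."""
    rng = random.Random(seed)
    # K-means++ seeding: each new centre is drawn with probability
    # proportional to the squared distance to the nearest chosen centre.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(dist(b, c) for c in centres) ** 2 for b in boxes]
        centres.append(rng.choices(boxes, weights=d2, k=1)[0])
    # Lloyd iterations: assign each box to its nearest centre, then move
    # every centre to the mean width and height of its cluster.
    for _ in range(iters):
        clusters: List[List[WH]] = [[] for _ in range(k)]
        for b in boxes:
            idx = min(range(k), key=lambda i: dist(b, centres[i]))
            clusters[idx].append(b)
        new = [
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centres[i]
            for i, cl in enumerate(clusters)
        ]
        if new == centres:
            break
        centres = new
    return sorted(centres, key=lambda c: c[0] * c[1])


# Hypothetical ground-truth box sizes in pixels.
sample = [(2, 4), (3, 5), (2, 5), (10, 12), (11, 13), (12, 12),
          (30, 40), (32, 42), (31, 39)]
anchors = kmeanspp_anchors(sample, k=3)
```

The anchors are returned sorted by area so that the smallest ones can be assigned to the highest-resolution prediction layer.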
The third part, making the small target detection network model lightweight, specifically comprises:
(3-1) channel pruning
Channel pruning is applied to the small target detection network to address its parameter redundancy. The γ of the BN layer in each YOLOv4 convolution module is used as a scaling factor, and an L1 regularization term on the BN-layer γ is added to the loss function. The network is trained with this sparsity constraint for a preset number of rounds, for example 300. The γ values after gradient updating are then sorted, a pruning threshold is set, and the channels whose γ is smaller than the threshold are removed, giving a pruned lightweight YOLOv4 network. In the YOLOv4 network, channel pruning is performed on all convolution modules containing a BN layer except the convolutional layers before the upsampling layers and the SPP structure. The channel pruning ratio is chosen through repeated experiments to balance speed and precision; the ratio finally selected is 0.7, which yields the channel-pruned model file and model structure configuration file.
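The threshold selection can be sketched as below. This is an illustration under assumptions: a single global threshold is taken over all prunable layers, at least one channel is kept per layer, and the layer names and γ values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def l1_sparsity_penalty(gammas: Dict[str, List[float]], lam: float) -> float:
    """L1 term on the BN scaling factors, added to the loss in sparse training."""
    return lam * sum(abs(g) for layer in gammas.values() for g in layer)


def prune_masks(gammas: Dict[str, List[float]], ratio: float,
                skip: Sequence[str] = ()) -> Tuple[float, Dict[str, List[bool]]]:
    """Return the global threshold and a keep-mask per layer.

    Layers named in `skip` (for example the convolutions before upsampling
    and the SPP block) are left untouched.
    """
    prunable = sorted(abs(g) for name, layer in gammas.items()
                      if name not in skip for g in layer)
    cut = min(round(len(prunable) * ratio), len(prunable) - 1)
    threshold = prunable[cut]
    masks: Dict[str, List[bool]] = {}
    for name, layer in gammas.items():
        if name in skip:
            masks[name] = [True] * len(layer)
            continue
        mask = [abs(g) >= threshold for g in layer]
        if not any(mask):
            # Assumed safeguard: never remove every channel of a layer.
            mask[max(range(len(layer)), key=lambda i: abs(layer[i]))] = True
        masks[name] = mask
    return threshold, masks


# Hypothetical gamma values after sparsity training.
bn_gammas = {
    "conv1": [0.9, 0.01, 0.5, 0.02],
    "conv2": [0.03, 0.8, 0.04, 0.05, 0.6, 0.06],
    "pre_upsample": [0.001, 0.002],
}
threshold, masks = prune_masks(bn_gammas, ratio=0.7, skip=("pre_upsample",))
```

With a ratio of 0.7, seven of the ten prunable channels in this toy example fall below the threshold and are dropped, while the skipped layer keeps both of its channels.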
(3-2) knowledge distillation recovery model accuracy
Although the removed channels contribute little to the model output, the accuracy of the model still drops slightly after channel pruning, so the model accuracy needs to be recovered.
Knowledge distillation is performed with the unpruned YOLOv4 network as the teacher network and the channel-pruned network as the student network. Knowledge distillation for YOLOv4 covers the learning of both the classification task and the regression task. For the regression results, the student does not learn directly from the teacher network when the regression loss is calculated, because the regression output is unbounded and the teacher network's prediction may contradict the true value. Instead, the L2 distance between the teacher network output and the label value and the L2 distance between the student network output and the label value are calculated separately. The deviation range w was set to 0.3 after comparing multiple experiments. When the deviation between the student network's L2 distance and the teacher network's L2 distance exceeds the range w, the L2 loss of the student network is added to the loss. In other words, once the student network outperforms the teacher network by a certain margin, this loss is no longer calculated for the student network. The overall loss function is:
Figure BDA0003032012480000131
$L_{reg} = (1 - v)\,L_{sL1}(R_s, y_{reg}) + v\,L_b(R_s, R_t, y_{reg})$
where $w$ is the preset deviation range, $y_{reg}$ is the true label value, $R_t$ and $R_s$ are the regression outputs of the teacher network and the student network respectively, $L_b$ is the distillation part of the loss, $L_{sL1}$ is the loss between the student network's regression output and the true label, and $v$ is the balance factor between $L_b$ and $L_{sL1}$, set between 0.1 and 0.5 for the first 80% of the network training time and between 0.6 and 0.9 for the last 20%; $L_{reg}$ is the total loss during network distillation learning.
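The combined regression loss can be sketched as follows. The exact form of $L_b$ is given only as an image in the source, so the sketch assumes the teacher-bounded form that matches the prose: the student's squared L2 error counts only while it is not better than the teacher's by the margin w. It also assumes that $L_{sL1}$ is a smooth L1 loss, and the two values chosen for v are arbitrary picks from the stated ranges.

```python
from typing import Sequence


def l2_sq(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared L2 distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smooth_l1(pred: Sequence[float], target: Sequence[float],
              beta: float = 1.0) -> float:
    """Smooth L1 loss summed over coordinates (assumed form of L_sL1)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def bounded_loss(r_s, r_t, y, w: float = 0.3) -> float:
    """Assumed L_b: zero once the student beats the teacher by margin w."""
    s, t = l2_sq(r_s, y), l2_sq(r_t, y)
    return s if s + w > t else 0.0


def balance_factor(progress: float, early: float = 0.3,
                   late: float = 0.8) -> float:
    """v for training progress in [0, 1]: one value for the first 80%,
    another for the last 20%."""
    return early if progress < 0.8 else late


def regression_loss(r_s, r_t, y, progress: float, w: float = 0.3) -> float:
    """L_reg = (1 - v) * L_sL1 + v * L_b."""
    v = balance_factor(progress)
    return (1 - v) * smooth_l1(r_s, y) + v * bounded_loss(r_s, r_t, y, w)


label = [1.0, 2.0]
teacher = [1.5, 2.5]       # squared error 0.5
good_student = [1.1, 2.1]  # much closer than the teacher: L_b is zero
weak_student = [2.0, 3.0]  # worse than the teacher: L_b is its L2 error
loss_good = regression_loss(good_student, teacher, label, progress=0.5)
loss_weak = regression_loss(weak_student, teacher, label, progress=0.5)
```

Raising v late in training shifts weight from the plain label loss toward the distillation term.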
The fourth part, detecting small targets in pictures, specifically comprises:
(4-1) Inputting an unmanned aerial vehicle aerial image.
(4-2) After an unmanned aerial vehicle aerial image is read, it is sent to the trained and optimized small target detection network to locate and classify the targets. The network first feeds the image to the feature extraction network with the attention mechanism to extract features, and three feature maps of different resolutions are output through the SPP modules. Targets at three different scales are then detected by regression and classification. The confidence threshold lies between 0.2 and 0.6 and is generally set to 0.3; after threshold filtering, the classification and positioning results of the targets are obtained.
(4-3) Repeating steps (4-1) to (4-2) until all pictures in the test set have been detected. The detection results for various small targets are shown in fig. 6.
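The threshold filtering in step (4-2) can be sketched as below. The `Detection` type, its field names and the sample detections are hypothetical, and keeping scores equal to the threshold is an assumption; the default of 0.3 is the value the text gives as typical.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    box: Tuple[float, float, float, float]  # (x, y, w, h) in pixels
    label: str
    score: float  # confidence in [0, 1]


def filter_by_confidence(dets: List[Detection],
                         threshold: float = 0.3) -> List[Detection]:
    """Keep detections whose confidence reaches the threshold."""
    return [d for d in dets if d.score >= threshold]


# Hypothetical raw detections pooled from the three prediction layers.
raw = [
    Detection((12.0, 30.0, 6.0, 9.0), "pedestrian", 0.82),
    Detection((40.0, 41.0, 5.0, 5.0), "pedestrian", 0.18),
    Detection((90.0, 15.0, 20.0, 12.0), "car", 0.35),
]
kept = filter_by_confidence(raw)
```

A lower threshold keeps more small, low-confidence targets at the cost of more false positives, which is why the text gives a range rather than a single value.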

Claims (9)

1. A lightweight small target detection method combined with an attention mechanism, characterized in that the method comprises the following steps:
(1) building an improved small target detection network based on YOLOv4;
(2) training and optimizing the small target detection network;
(3) making the small target detection network model lightweight;
(4) detecting an input image with the trained small target detection network model.
2. The lightweight small target detection method combined with an attention mechanism according to claim 1, characterized in that step (1) comprises the following steps:
(1-1) constructing an MSE multi-scale attention mechanism module and inserting it into the feature extraction network;
(1-2) adding a shallow feature map as a prediction layer;
(1-3) improving the SPP module.
3. The lightweight small target detection method combined with an attention mechanism according to claim 2, characterized in that step (1-1) comprises the following steps: constructing an MSE multi-scale attention mechanism module and inserting it between the Concat layer and the CBM module in each CSP module of the YOLOv4 feature extraction network CSPDarknet53 to form a new MSE-CSPUnit module, thereby obtaining the feature extraction network MSE-CSPDarknet53 with attention information.
4. The lightweight small target detection method combined with an attention mechanism according to claim 2 or 3, characterized in that step (1-1) constructs the MSE multi-scale attention mechanism module on the basis of the SE attention mechanism module, comprising the following steps:
(1-1-1) taking the output of the Concat layer of the CSP module as the input feature $X$, and integrating feature maps of multiple scales through convolution kernels of different sizes to obtain the multi-scale fused feature output $X_c$; the convolution kernel sizes are 3×3, 5×5 and 7×7, and $X_c = V_{3\times3}X + V_{5\times5}X + V_{7\times7}X$, where $V$ denotes a convolution operation using convolution kernels of different sizes;
(1-1-2) performing a squeeze operation on $X_c$, using global average pooling and global max pooling respectively to squeeze the channels and obtain channel-level feature information, where global average pooling emphasizes global features and global max pooling emphasizes local features,
Figure FDA0003032012470000021
$X_{max} = \max(X_c(i,j))$;
where $X_{avg}$ is the feature obtained after global average pooling, $X_{max}$ is the feature obtained after global max pooling, $i = 1, 2, \dots, H$, $j = 1, 2, \dots, W$, and $H$ and $W$ are the input height and input width respectively;
(1-1-3) performing an excitation operation on $X_{avg}$ and $X_{max}$ respectively, adding the results, and applying a normalization operation to generate the attention weight information $X_s$; $FC_1$ and $FC_2$ are two different fully connected layers, where
Figure FDA0003032012470000022
$C$ is the number of input channels, $r$ is the dimensionality reduction ratio, $FC_1$ reduces the dimension so as to reduce the parameters of the fully connected layers, and $FC_2$ restores the original dimension;
$X_a = FC_2(\mathrm{Mish}(FC_1(X_{avg})))$
$X_m = FC_2(\mathrm{Mish}(FC_1(X_{max})))$
$X_s = \mathrm{Softmax}(X_a + X_m)$
where Mish is a nonlinear activation function and Softmax is a normalization function;
(1-1-4) weighting $X_c$ generated in (1-1-1) by $X_s$ generated in (1-1-3) to obtain the output $X_{weight}$ of the MSE multi-scale attention module, $X_{weight} = \mathrm{Scale}(X_c, X_s)$, and taking $X_{weight}$ as the input of the CBM module in the MSE-CSPUnit module.
5. The lightweight small target detection method combined with an attention mechanism according to claim 2, characterized in that in step (1-2), the 19×19 feature map output by the FPN and PAN structures is deleted, and the original 38×38 and 76×76 output feature maps of the FPN and PAN structures are retained; the FPN and PAN structures fuse the output of MSE-CSPUnit*2 with the upsampled result of the deeper feature map below it to obtain a shallow feature map of size 152×152; finally, three feature maps of different sizes, 38×38, 76×76 and 152×152, are obtained to predict targets of different scales.
6. The lightweight small target detection method combined with an attention mechanism according to claim 6, characterized in that in step (1-3), an SPP module is placed between the FPN and PAN structures and each of the three corresponding prediction layers; the SPP module applies max pooling of 1×1, 5×5, 9×9 and 13×13 to the input feature map and then concatenates the resulting feature maps of different scales as tensors.
7. The lightweight small target detection method combined with an attention mechanism according to claim 1, characterized in that step (2) comprises the following steps:
(2-1) constructing a small target data set;
(2-2) performing data augmentation and multi-mode random adjustment of the picture data;
(2-3) setting anchor boxes for fitting the targets in the data set.
8. The lightweight small target detection method combined with an attention mechanism according to claim 1, characterized in that step (3) comprises the following steps:
(3-1) channel pruning
selecting γ of the BN layer as the scaling factor, adding an L1 regularization term on γ of the BN layer to the loss function, performing sparsification training on the network for a preset number of rounds, and then, based on the γ values after gradient updating, performing channel pruning on the layers other than the convolutional layers before the upsampling layers and the SPP module, to obtain the channel-pruned model file and model structure configuration file;
(3-2) knowledge distillation to restore network accuracy
taking the unpruned YOLOv4 network as the teacher network and the channel-pruned network as the student network; calculating the L2 loss between the teacher network and the label value and the L2 loss between the student network and the label value respectively, and setting a deviation range; when the deviation of the L2 loss between the student network and the label value from the L2 loss between the teacher network and the label value exceeds the range $w$, counting the L2 loss of the student network in the total loss; the overall loss function is
Figure FDA0003032012470000041
$L_{reg} = (1 - v)\,L_{sL1}(R_s, y_{reg}) + v\,L_b(R_s, R_t, y_{reg})$
where $w$ is the preset deviation range, $y_{reg}$ is the label value, $R_t$ and $R_s$ are the regression outputs of the teacher network and the student network respectively, $L_b$ is the distillation part of the loss, $L_{sL1}$ is the loss between the student network's regression output and the label value, and $v$ is the balance factor between $L_b$ and $L_{sL1}$, set between 0.1 and 0.5 for the first 80% of the network training time and between 0.6 and 0.9 for the last 20%; $L_{reg}$ is the total loss during network distillation learning.
9. The lightweight small target detection method combined with an attention mechanism according to claim 1, characterized in that step (4) comprises the following steps:
(4-1) inputting one frame of image;
(4-2) after an image is read, sending it to the trained and optimized small target detection network to locate and classify the targets; the image is input to the feature extraction network with the attention mechanism to extract features, three feature maps of different resolutions are output through the SPP modules, targets at three different scales are detected on the three feature maps, the confidence threshold is set to 0.2 to 0.6, and after threshold filtering the classification and positioning results of the targets are obtained;
(4-3) repeating steps (4-1) to (4-2) until all pictures in the test set have been detected.
CN202110432768.6A 2021-04-21 2021-04-21 Lightweight small target detection method combined with attention mechanism Active CN113065558B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110432768.6A CN113065558B (en) 2021-04-21 2021-04-21 Lightweight small target detection method combined with attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110432768.6A CN113065558B (en) 2021-04-21 2021-04-21 Lightweight small target detection method combined with attention mechanism

Publications (2)

Publication Number Publication Date
CN113065558A true CN113065558A (en) 2021-07-02
CN113065558B CN113065558B (en) 2024-03-22

Family

ID=76567333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110432768.6A Active CN113065558B (en) 2021-04-21 2021-04-21 Lightweight small target detection method combined with attention mechanism

Country Status (1)

Country Link
CN (1) CN113065558B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257794A (en) * 2020-10-27 2021-01-22 东南大学 A Lightweight Object Detection Method Based on YOLO
CN112329721A (en) * 2020-11-26 2021-02-05 上海电力大学 Remote sensing small target detection method with lightweight model design
US20210049423A1 (en) * 2019-07-31 2021-02-18 Zhejiang University Efficient image classification method based on structured pruning

Cited By (96)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109002848B (en) * 2018-07-05 2021-11-05 西华大学 A weak and small target detection method based on feature mapping neural network
CN109002848A (en) * 2018-07-05 2018-12-14 西华大学 A kind of detection method of small target based on Feature Mapping neural network
CN113642402A (en) * 2021-07-13 2021-11-12 重庆科技学院 Image target detection method based on deep learning
CN113408549A (en) * 2021-07-14 2021-09-17 西安电子科技大学 Few-sample weak and small target detection method based on template matching and attention mechanism
CN113408549B (en) * 2021-07-14 2023-01-24 西安电子科技大学 Few-sample Weak Object Detection Method Based on Template Matching and Attention Mechanism
CN113486990B (en) * 2021-09-06 2021-12-21 北京字节跳动网络技术有限公司 Training method of endoscope image classification model, image classification method and device
CN113486990A (en) * 2021-09-06 2021-10-08 北京字节跳动网络技术有限公司 Training method of endoscope image classification model, image classification method and device
CN113780406A (en) * 2021-09-08 2021-12-10 福州大学 A method for detecting end faces of bundled logs based on YOLO
CN113743514A (en) * 2021-09-08 2021-12-03 庆阳瑞华能源有限公司 Knowledge distillation-based target detection method and target detection terminal
CN113807311A (en) * 2021-09-29 2021-12-17 中国人民解放军国防科技大学 A multi-scale target recognition method
CN113962882A (en) * 2021-09-29 2022-01-21 西安交通大学 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
CN113962882B (en) * 2021-09-29 2023-08-25 西安交通大学 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
CN113837144A (en) * 2021-10-25 2021-12-24 广州微林软件有限公司 Intelligent image data acquisition and processing method for refrigerator
CN113837144B (en) * 2021-10-25 2022-09-13 广州微林软件有限公司 Intelligent image data acquisition and processing method for refrigerator
CN114022705A (en) * 2021-10-29 2022-02-08 电子科技大学 An adaptive object detection method based on scene complexity pre-classification
CN114022705B (en) * 2021-10-29 2023-08-04 电子科技大学 An adaptive object detection method based on pre-classification of scene complexity
CN114037888B (en) * 2021-11-05 2024-03-08 中国人民解放军国防科技大学 Target detection method and system based on joint attention and adaptive NMS
CN114037888A (en) * 2021-11-05 2022-02-11 中国人民解放军国防科技大学 Joint attention and adaptive NMS (network management System) -based target detection method and system
CN114359382A (en) * 2021-11-16 2022-04-15 海宁集成电路与先进制造研究院 Cooperative target ball detection method based on deep learning and related device
CN114067437A (en) * 2021-11-17 2022-02-18 山东大学 A method and system for out-of-tube detection based on positioning and video surveillance data
CN114067437B (en) * 2021-11-17 2024-04-16 山东大学 Method and system for detecting pipe removal based on positioning and video monitoring data
CN114120154A (en) * 2021-11-23 2022-03-01 宁波大学 An automatic detection method for glass curtain wall damage of high-rise buildings
CN114299154A (en) * 2021-11-23 2022-04-08 上海电机学院 Method for installing lock pin into container corner fitting based on vision system
CN114332598A (en) * 2021-11-23 2022-04-12 上海电机学院 Visual system for disassembling lock pin of container
CN114283402A (en) * 2021-11-24 2022-04-05 西北工业大学 License plate detection method based on knowledge distillation training and space-time combined attention
CN114283402B (en) * 2021-11-24 2024-03-05 西北工业大学 License plate detection method based on knowledge distillation training and space-time combined attention
CN114169501A (en) * 2021-12-02 2022-03-11 深圳市华尊科技股份有限公司 Neural network compression method and related equipment
CN114240869A (en) * 2021-12-10 2022-03-25 浙江大学 Rotating target detection method based on Transformer
CN113902744B (en) * 2021-12-10 2022-03-08 湖南师范大学 Lightweight network-based image detection method, system, device and storage medium
CN113902744A (en) * 2021-12-10 2022-01-07 湖南师范大学 Lightweight network-based image detection method, system, device and storage medium
CN114220032A (en) * 2021-12-21 2022-03-22 一拓通信集团股份有限公司 A method for small target detection in UAV video based on channel cropping
CN116433566A (en) * 2021-12-30 2023-07-14 上海新微技术研发中心有限公司 Chip metal round hole detection method and device, storage medium and terminal
CN114565896A (en) * 2022-01-05 2022-05-31 西安电子科技大学 A cross-layer fusion improved YOLOv4 road target recognition algorithm
CN114092820A (en) * 2022-01-20 2022-02-25 城云科技(中国)有限公司 Target detection method and moving target tracking method applying same
CN114419410A (en) * 2022-01-25 2022-04-29 中国农业银行股份有限公司 Target detection method, device, equipment and storage medium
CN114550032A (en) * 2022-01-28 2022-05-27 中国科学技术大学 Video smoke detection method of end-to-end three-dimensional convolution target detection network
CN114529817A (en) * 2022-02-21 2022-05-24 东南大学 Unmanned aerial vehicle photovoltaic fault diagnosis and positioning method based on attention neural network
CN114529817B (en) * 2022-02-21 2024-11-12 东南大学 Photovoltaic fault diagnosis and positioning method for unmanned aerial vehicles based on attention neural network
CN114550013A (en) * 2022-02-24 2022-05-27 佛山市南海区广工大数控装备协同创新研究院 A SimROD-based Aerial Image Detection and Recognition Method
CN114549959A (en) * 2022-02-28 2022-05-27 西安电子科技大学广州研究院 Infrared dim target real-time detection method and system based on target detection model
CN114594461A (en) * 2022-03-14 2022-06-07 杭州电子科技大学 Sonar target detection method based on attention perception and scaling factor pruning
CN114594461B (en) * 2022-03-14 2025-09-16 杭州电子科技大学 Sonar target detection method based on attention perception and scaling factor pruning
CN114648645A (en) * 2022-03-24 2022-06-21 苏州科达科技股份有限公司 Target detection method, training method of target detection model and electronic equipment
CN114882205A (en) * 2022-04-01 2022-08-09 西安电子科技大学 Target detection method based on attention mechanism
CN114783021A (en) * 2022-04-07 2022-07-22 广州杰赛科技股份有限公司 Intelligent detection method, device, equipment and medium for wearing of mask
CN114708231A (en) * 2022-04-11 2022-07-05 常州大学 Sugarcane aphid target detection method based on light-weight YOLO v5
CN114463686A (en) * 2022-04-11 2022-05-10 西南交通大学 Method and system for moving target detection based on complex background
CN114972181A (en) * 2022-04-15 2022-08-30 西安理工大学 Heavy part coating surface defect detection method based on multi-scale detection
CN115618271A (en) * 2022-05-05 2023-01-17 腾讯科技(深圳)有限公司 Object type identification method, device, equipment and storage medium
CN115618271B (en) * 2022-05-05 2023-11-17 腾讯科技(深圳)有限公司 Object category identification method, device, equipment and storage medium
CN114663654A (en) * 2022-05-26 2022-06-24 西安石油大学 Improved YOLOv4 network model and small target detection method
US11915474B2 (en) 2022-05-31 2024-02-27 International Business Machines Corporation Regional-to-local attention for vision transformers
CN115019169A (en) * 2022-05-31 2022-09-06 海南大学 Single-stage water surface small target detection method and device
CN114862844A (en) * 2022-06-13 2022-08-05 合肥工业大学 Infrared small target detection method based on feature fusion
CN114862844B (en) * 2022-06-13 2023-08-08 合肥工业大学 A Method of Infrared Small Target Detection Based on Feature Fusion
CN115223041A (en) * 2022-06-21 2022-10-21 中国电子科技集团公司第五十二研究所 Ship important feature identification method based on deep learning
CN115131653A (en) * 2022-06-28 2022-09-30 南京信息工程大学 Target detection network optimization method, target detection method, device and storage medium
CN115082869B (en) * 2022-07-07 2023-09-15 燕山大学 A vehicle-road collaborative multi-target detection method and system serving special vehicles
CN115082869A (en) * 2022-07-07 2022-09-20 燕山大学 A vehicle-road collaborative multi-target detection method and system for special vehicles
CN115170930A (en) * 2022-07-13 2022-10-11 广州科语机器人有限公司 Sample unbalanced target detection method based on improved YOLOX model
CN115170930B (en) * 2022-07-13 2025-09-23 广州科语机器人有限公司 Sample imbalance target detection method based on improved YOLOX model
CN115311223A (en) * 2022-08-03 2022-11-08 国网江苏省电力有限公司电力科学研究院 Multi-scale fusion intelligent power grid inspection method and device
CN115331384A (en) * 2022-08-22 2022-11-11 重庆科技学院 Operation platform fire accident early warning system based on edge computing
CN115471746A (en) * 2022-08-26 2022-12-13 中船航海科技有限责任公司 A ship target recognition and detection method based on deep learning
CN115375668A (en) * 2022-09-07 2022-11-22 西安电子科技大学 Infrared Single Frame Small Target Detection Method Based on Attention Mechanism
CN115375668B (en) * 2022-09-07 2025-07-04 西安电子科技大学 Infrared single-frame small target detection method based on attention mechanism
CN115482523A (en) * 2022-10-11 2022-12-16 长春工业大学 Small object target detection method and system of lightweight multi-scale attention mechanism
CN115546500A (en) * 2022-10-31 2022-12-30 西安交通大学 A small target detection method in infrared images
CN115424154A (en) * 2022-11-01 2022-12-02 速度时空信息科技股份有限公司 Data enhancement and training method for unmanned aerial vehicle image target detection
CN115984172A (en) * 2022-11-29 2023-04-18 上海师范大学 A Small Target Detection Method Based on Enhanced Feature Extraction
CN115761549A (en) * 2022-11-30 2023-03-07 航天时代飞鸿技术有限公司 A method and system for incremental detection and identification of weak targets with small samples of drones
CN116206185A (en) * 2023-02-27 2023-06-02 山东浪潮科学研究院有限公司 Lightweight small target detection method based on improved YOLOv7
CN116306886A (en) * 2023-03-15 2023-06-23 电子科技大学 Channel Attention-Based Convolutional Neural Network Pruning Approach for Text Detection in Natural Scenes
CN116245860A (en) * 2023-03-16 2023-06-09 福州大学 A small target detection method based on super-resolution-yolo network
CN116245860B (en) * 2023-03-16 2025-07-01 福州大学 A small target detection method based on super-resolution-yolo network
CN116401547A (en) * 2023-04-04 2023-07-07 桂林电子科技大学 A lightweight low-light target detection method
CN116205967A (en) * 2023-04-27 2023-06-02 中国科学院长春光学精密机械与物理研究所 Medical image semantic segmentation method, device, equipment and medium
CN116363138A (en) * 2023-06-01 2023-06-30 湖南大学 A Lightweight Integrated Recognition Method for Garbage Sorting Images
CN116363138B (en) * 2023-06-01 2023-08-22 湖南大学 A Lightweight Integrated Recognition Method for Garbage Sorting Images
CN116935334A (en) * 2023-06-14 2023-10-24 安徽工程大学 Electric vehicle rider helmet wearing monitoring system based on unmanned aerial vehicle aerial photography
CN116883980A (en) * 2023-09-04 2023-10-13 国网湖北省电力有限公司超高压公司 Ultraviolet light insulator target detection method and system
CN116894983B (en) * 2023-09-05 2023-11-21 云南瀚哲科技有限公司 Knowledge distillation-based fine-grained agricultural pest image identification method and system
CN116894983A (en) * 2023-09-05 2023-10-17 云南瀚哲科技有限公司 Fine-grained agricultural pest and disease image recognition method and system based on knowledge distillation
CN116912890B (en) * 2023-09-14 2023-11-24 国网江苏省电力有限公司常州供电分公司 Method and device for detecting birds in transformer substation
CN116912890A (en) * 2023-09-14 2023-10-20 国网江苏省电力有限公司常州供电分公司 Substation bird detection method and device
CN117218505A (en) * 2023-09-25 2023-12-12 佳源科技股份有限公司 A method for identifying substation status indicators based on deep learning
CN117444455A (en) * 2023-11-30 2024-01-26 浙江中路金属股份有限公司 Reinforcing mesh welding positioning system
CN117496509B (en) * 2023-12-25 2024-03-19 江西农业大学 YOLOv7 grapefruit counting method integrating multi-teacher knowledge distillation
CN117496509A (en) * 2023-12-25 2024-02-02 江西农业大学 YOLOv7 grapefruit counting method integrating multi-teacher knowledge distillation
CN117953192A (en) * 2024-01-09 2024-04-30 北京地铁建筑设施维护有限公司 A ceiling disease early warning method and image acquisition device
CN118470576A (en) * 2024-07-09 2024-08-09 齐鲁空天信息研究院 A small target detection method and system for unmanned aerial vehicle images
CN118781478B (en) * 2024-09-11 2024-11-12 浙江省智能船舶研究院有限公司 An unmanned boat obstacle recognition method based on image analysis and its model construction method
CN118781478A (en) * 2024-09-11 2024-10-15 浙江省智能船舶研究院有限公司 An unmanned boat obstacle recognition method based on image analysis and its model construction method
CN119580066A (en) * 2024-11-19 2025-03-07 重庆邮电大学 A target detection method based on lightweight deep learning
CN119622592A (en) * 2024-12-10 2025-03-14 安徽工业大学 A device status diagnosis method and system for multi-channel fusion neural network, electronic device, and computer-readable medium
CN120564137A (en) * 2025-07-31 2025-08-29 山东港口烟台港集团有限公司 A real-time dangerous goods identification system and method for shipping logistics

Also Published As

Publication number Publication date
CN113065558B (en) 2024-03-22

Similar Documents

Publication Publication Date Title
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN111126472B (en) An Improved Target Detection Method Based on SSD
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN108764372B (en) Data set construction method and device, mobile terminal and readable storage medium
CN108416307B (en) Method, device and equipment for detecting pavement cracks of aerial images
CN106446930B (en) Robot working scene recognition method based on deep convolutional neural network
CN112150493B (en) Semantic guidance-based screen area detection method in natural scene
CN108596108B (en) Aerial remote sensing image change detection method based on triple semantic relation learning
CN114972208A (en) YOLOv4-based lightweight wheat scab detection method
CN111126184B (en) Post-earthquake building damage detection method based on unmanned aerial vehicle video
CN110689021A (en) Real-time target detection method in low-visibility environment based on deep learning
CN106897673A (en) A pedestrian re-identification method based on the Retinex algorithm and convolutional neural networks
CN109360179B (en) Image fusion method and device and readable storage medium
CN107633226A (en) A human action tracking and recognition method and system
CN110879982A (en) Crowd counting system and method
CN108665509A (en) A super-resolution reconstruction method, device, equipment and readable storage medium
CN110222718A (en) Method and device for image processing
CN113569981A (en) A power inspection bird's nest detection method based on single-stage target detection network
CN104850857B (en) Cross-camera pedestrian target matching method based on visual spatial saliency constraint
CN111340019A (en) Detection method of granary pests based on Faster R-CNN
CN111738099A (en) An automatic face detection method based on video image scene understanding
CN108133235A (en) A pedestrian detection method based on neural network multi-scale feature maps
CN112418262A (en) Vehicle re-identification method, client and system
CN112668662B (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN104463962B (en) Three-dimensional scene reconstruction method based on GPS information video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant