CN108564097B - Multi-scale target detection method based on deep convolutional neural network - Google Patents
Multi-scale target detection method based on deep convolutional neural network
- Publication number
- CN108564097B (application CN201711267789.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Description
Technical field
The invention relates to the technical field of computer image processing, and in particular to a multi-scale target detection method based on a deep convolutional neural network.
Background
Target detection and recognition is one of the important topics in computer vision. As science and technology have developed, target detection has been put to ever wider use, and it is applied in many scenarios to achieve a range of goals, such as battlefield surveillance, security inspection, traffic control, and video surveillance.
In recent years, with the rapid development of deep learning, deep convolutional neural networks have brought further breakthroughs in target detection and recognition. A deep convolutional neural network can extract high-level semantic features from an image, and these features can then be used to detect targets. The deeper the network, the more representative the features it expresses. The drawback is that small-scale objects are represented very coarsely, and some of their features may even be lost. Moreover, neural networks are very sensitive to scale: the features extracted from objects of different scales differ greatly, which leads to low detection accuracy for small-scale objects and greatly reduces the robustness and effectiveness of target detection.
Summary of the invention
The purpose of the invention is to overcome the shortcomings of the prior art by proposing a multi-scale target detection method based on a deep convolutional neural network. The method detects both large-scale and small-scale targets well, overcoming the limitation of earlier methods, which could not reliably detect targets of the same class that differ greatly in scale.
To achieve the above purpose, the technical solution provided by the invention is a multi-scale target detection method based on a deep convolutional neural network, comprising the following steps:
1) Data acquisition
Training a deep convolutional neural network requires a large amount of training data, so large-scale natural image or video image data is used. If the image data has no labels, it is annotated manually. The data is then divided into a training data set and a validation data set;
2) Data processing
The images and label data of the image data set are converted through preprocessing into the format required for training the deep convolutional neural network;
3) Model construction
According to the training objective and the input and output form of the model, a deep convolutional neural network suited to the multi-scale target detection problem is constructed;
4) Defining the loss function
According to the training objective and the model architecture, the required loss function is defined;
5) Model training
The parameters of each network layer are initialized, and training samples are fed in iteratively. The loss value of the network is computed from the loss function, the gradients of the parameters of each layer are computed by backpropagation, and the parameters of each layer are updated by stochastic gradient descent;
6) Model validation
The trained model is validated on the validation data set to test its generalization performance.
Step 2) comprises the following steps:
2.1) Scale each image in the data set to m×n pixels (length × width), and scale the label data by the same ratio;
2.2) From the scaled image, randomly crop a region that contains labels to obtain a rectangular image of a×b pixels, where a<=m and b<=n;
2.3) Flip the cropped image horizontally at random with probability 0.5;
2.4) Convert the pixel values of the randomly flipped image from the range [0,255] to the range [-1,1].
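Steps 2.3) and 2.4) can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the linear map `v / 127.5 - 1` is one assumed way to reach the stated range, the image is modelled as a list of pixel rows, and the function names are hypothetical. Boxes use the (x, y, w, h) convention with a top-left origin that the description uses later.

```python
import random


def normalize_pixel(value):
    """Map an 8-bit pixel value in [0, 255] linearly onto [-1, 1]."""
    return value / 127.5 - 1.0


def flip_box_horizontally(box, image_width):
    """Mirror a box (x, y, w, h), x/y being the top-left corner, left to right."""
    x, y, w, h = box
    return (image_width - x - w, y, w, h)


def maybe_flip(image_rows, boxes, rng=random):
    """With probability 0.5, flip the image rows and their label boxes together."""
    if rng.random() < 0.5:
        width = len(image_rows[0])
        image_rows = [row[::-1] for row in image_rows]
        boxes = [flip_box_horizontally(b, width) for b in boxes]
    return image_rows, boxes
```

The point of flipping the boxes in the same call is that the label data has to stay aligned with the image after every geometric transformation.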
Step 3) comprises the following steps:
3.1) Constructing the feature extraction network model
The feature extraction network acts as an encoder: it extracts high-level semantic information from the input image and stores it in a low-dimensional encoding. Its input is the image processed in step 2). Small objects lose part of their information in deeper encodings, so to preserve more information the network outputs both a low-dimensional and a relatively low-dimensional feature encoding. To transform the input into this series of outputs, the feature extraction network contains several cascaded downsampling layers. Each downsampling layer consists of a convolutional layer, a batch normalization layer, a nonlinear activation function layer, and a pooling layer connected in series. The convolutional layer has a stride of 1 and a kernel size of 3×3 and extracts the corresponding feature maps. The batch normalization layer normalizes the mean and standard deviation of the input samples in a batch, which stabilizes and accelerates training. The nonlinear activation function layer prevents the model from degenerating into a simple linear model and improves its descriptive power. The pooling layer reduces the size of the feature map, which enlarges the receptive field of the convolution kernels;
3.2) Constructing the region generation network model
The region generation network is responsible for finding all objects in the input image and their positions. It takes a feature map as input, maps every point on the feature map back to the original image to obtain the coordinates of those points, places around each point a set of predefined candidate boxes of different sizes and aspect ratios, and computes for each box a probability score that it contains an object. The input of the region generation network is the output of the feature extraction network of step 3.1); its output is the coordinates of a series of candidate boxes and the probability scores that those boxes are objects;
To transform the input into the output, the region generation network model comprises three functional structures connected in series, each made up of a convolutional layer, a batch normalization layer, and a nonlinear activation function layer. The first functional structure fuses features over a 3×3 neighborhood of the input, merging surrounding information, and its output serves as the input to both the second and the third functional structures. The second functional structure outputs the coordinate information of the rectangular boxes, and the third outputs the probability score that each corresponding rectangular box is an object;
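The candidate-box construction in step 3.2) can be illustrated as follows. This is a sketch under stated assumptions: the stride, the box sizes, the aspect ratios, and the choice of mapping each feature point to the centre of its cell are all hypothetical, since the source only says the boxes are predefined with different sizes and aspect ratios.

```python
def make_candidate_boxes(feat_h, feat_w, stride, sizes, ratios):
    """Return candidate boxes (x, y, w, h) for every point of a feature map.

    stride: assumed downsampling factor between the image and the feature map.
    sizes:  assumed box side lengths in pixels (for a square box).
    ratios: assumed width/height ratios; each box keeps an area of size**2.
    """
    boxes = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Map the feature-map point back to image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for size in sizes:
                for ratio in ratios:
                    w = size * ratio ** 0.5
                    h = size / ratio ** 0.5
                    boxes.append((cx - w / 2, cy - h / 2, w, h))
    return boxes
```

With three sizes and three ratios this gives 9 boxes per feature point. The 36-channel and 18-channel outputs in the embodiment would be consistent with 4 coordinates and 2 scores per box for 9 boxes, but the source does not state the number of boxes per point.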
3.3) Constructing the content-aware region-of-interest pooling layer
The content-aware region-of-interest pooling layer maps a target region of the original image onto the low-dimensional encoding obtained in step 3.1) and then pools the mapped region to a fixed size. Its content awareness shows in the following two aspects:
3.3.1) Information completion
Information completion restores the information that small targets lose during low-dimensional encoding, so that small targets are detected more accurately. Consider the feature-map region obtained by mapping a target region of the original image onto the low-dimensional encoding of step 3.1), and let z be a threshold whose value is chosen according to the needs of the network. If one of the region's length and width is greater than z and the other is less than z, the region is enlarged by deconvolution to a square with side max(length, width) and then pooled. If both the length and the width are less than z, both are enlarged to twice their original size by deconvolution and then pooled. If both the length and the width are greater than z, the subsequent pooling operation is applied directly;
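The three cases of step 3.3.1) reduce to a small decision function on the size of the mapped region. The sketch below only computes the size the region is enlarged to before pooling; the deconvolution itself is not shown. The source does not say what happens when a side equals z exactly, so treating a side equal to z as "not smaller than z" is an assumption, and the function name is hypothetical.

```python
def completed_size(length, width, z):
    """Size of a mapped region after information completion, before pooling."""
    if length >= z and width >= z:
        # Both sides are large enough: pool directly, no enlargement.
        return (length, width)
    if length < z and width < z:
        # Both sides are small: enlarge each side to twice its size.
        return (2 * length, 2 * width)
    # Exactly one side is small: enlarge to a square on the longer side.
    side = max(length, width)
    return (side, side)
```

For example, with z = 7 a 3×4 region becomes 6×8, a 3×9 region becomes 9×9, and a 10×12 region is left unchanged.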
3.3.2) Size division
The target regions of the original image output by step 3.2) are divided by size, using the mean area of all label boxes in the prepared training data set. If the area of a rectangular box output by step 3.2) is smaller than this mean, it is marked as a small-target output; if it is greater than or equal to the mean, it is marked as a large-target output;
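The size division of step 3.3.2) is a threshold on box area. A minimal sketch, assuming boxes are given as (x, y, w, h) tuples and using hypothetical function names:

```python
def mean_label_area(label_boxes):
    """Mean area of all label boxes (x, y, w, h) in the training data set."""
    return sum(w * h for (_, _, w, h) in label_boxes) / len(label_boxes)


def split_by_size(proposals, mean_area):
    """Split proposals into small targets (area < mean) and large targets."""
    small, large = [], []
    for box in proposals:
        _, _, w, h = box
        if w * h < mean_area:
            small.append(box)
        else:
            large.append(box)
    return small, large
```

The threshold is computed once from the training labels, so the same value applies to every proposal at training and at validation time.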
3.4) Constructing the multi-task classification network
The multi-task classification network recognizes large-scale and small-scale targets separately, which prevents classification errors caused by the different low-dimensional encodings of large and small targets. The large and small rectangular boxes obtained in step 3.3) are fed into two separate classification networks. Each classification network outputs class scores for the classification task and a refined box position for the regression task. To complete both tasks, the network contains fully connected layers, nonlinear activation function layers, and dropout layers. The fully connected layers map the learned "distributed feature representation" to the sample label space. The nonlinear activation function layers prevent the model from degenerating into a simple linear model and improve its descriptive power. The dropout layers deactivate neurons with probability 0.5, which makes training converge faster and prevents overfitting;
Finally, the outputs of the large-target and small-target classification networks are fused to form the final output;
Step 4) comprises the following steps:
4.1) Defining the loss function of the region generation network
The region generation network uses the low-dimensional encoding to obtain the coordinates of regions of interest in the input image and a score for whether each region is foreground, that is, a regression task and a classification task. The regression loss is defined so that the output boxes lie as close as possible to the ground-truth reference boxes. It can therefore be defined as the smooth L1 (smoothed Manhattan distance) loss, SmoothL1Loss, with the formula as follows:
where L_reg is the regression loss, v and t denote the position of the predicted box and of its corresponding ground-truth reference box respectively, x and y denote the coordinates of the top-left corner, and w and h denote the width and height of the rectangular box;
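The regression formula itself did not survive in this copy of the text. The sketch below assumes the widely used form of the smooth L1 loss, which is quadratic for differences below 1 and linear above, summed over the four box components x, y, w, and h. Whether the patent applies it to raw coordinates or to encoded offsets is not stated, so the raw difference used here is an assumption.

```python
def smooth_l1(d):
    """Smooth L1 of one difference: 0.5*d^2 if |d| < 1, otherwise |d| - 0.5."""
    a = abs(d)
    return 0.5 * a * a if a < 1.0 else a - 0.5


def regression_loss(v, t):
    """L_reg: sum of smooth L1 over (x, y, w, h) of predicted box v and reference t."""
    return sum(smooth_l1(vi - ti) for vi, ti in zip(v, t))
```

The linear tail keeps a badly misplaced box from dominating the gradient, while the quadratic part near zero keeps the loss smooth for boxes that are almost correct.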
The loss function of the classification task is defined as the softmax loss (SoftmaxLoss), with the formula as follows:
$$x'_i = x'_i - \max(x'_1, \ldots, x'_n)$$
$$L_{cls} = -\log p_i$$
where x' is the output of the network, n is the total number of classes, p is the probability of each class, and L_cls is the classification loss;
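The two surviving lines of the classification formula subtract the maximum score and take the negative log probability of the target class. The line defining p is missing from this copy; the sketch below assumes the standard softmax, p_i = exp(x'_i) / sum_j exp(x'_j). The function name and the `target` argument are hypothetical.

```python
import math


def softmax_loss(scores, target):
    """L_cls = -log p_target, with p the softmax of the max-shifted scores."""
    m = max(scores)
    shifted = [s - m for s in scores]      # x'_i = x'_i - max(x'_1, ..., x'_n)
    log_sum = math.log(sum(math.exp(s) for s in shifted))
    return log_sum - shifted[target]       # equals -log(exp(x'_t) / sum_j exp(x'_j))
```

Subtracting the maximum does not change the probabilities, but it keeps every exponent at or below zero, so large scores such as 1000 do not overflow.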
4.2) Defining the loss function of the classification network
The classification network outputs class scores for the classification task and a refined box position for the regression task. The loss functions are defined so that the output class agrees with the label data as closely as possible and the output box position agrees with the ground-truth reference box as closely as possible. As in step 4.1), the regression loss can be defined as SmoothL1Loss and the classification loss as SoftmaxLoss;
4.3) Defining the total loss function
The two region generation network loss functions and the two classification network loss functions defined in steps 4.1) and 4.2) can be combined by weighting, so that the network can perform multi-scale target detection in images;
Step 5) comprises the following steps:
5.1) Initializing the parameters of each layer of the model
The parameters of each layer are initialized with the methods used in conventional deep convolutional neural networks. The convolutional layers of the feature extraction network take as initial values the convolutional layer parameters of a VGG16 network model pre-trained on ImageNet. The convolutional layers of the region generation network and the fully connected layers of the classification networks are initialized from a Gaussian distribution with mean 0 and standard deviation 0.02, and the parameters of all batch normalization layers are initialized from a Gaussian distribution with mean 1 and standard deviation 0.02;
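The Gaussian initialization in step 5.1) can be sketched with the standard library alone. The flat list of weights, the function name, and the seed are illustrative assumptions; loading the pre-trained VGG16 parameters is not shown.

```python
import random


def gaussian_init(count, mean, std, seed=0):
    """Draw `count` initial parameter values from N(mean, std^2)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(count)]


# Convolutional / fully connected weights: mean 0, standard deviation 0.02.
conv_weights = gaussian_init(10000, 0.0, 0.02)
# Batch normalization parameters: mean 1, standard deviation 0.02.
bn_params = gaussian_init(10000, 1.0, 0.02)
```

Centering the batch normalization parameters on 1 means each such layer starts out close to a plain normalization with unit scale.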
5.2) Training the network model
An image processed in step 2) is chosen at random as input and passed through the feature extraction network of step 3.1) to obtain the corresponding low-dimensional encoded features. The region generation network of step 3.2) then generates a batch of candidate boxes, and the corresponding loss value is computed as in step 4.1). These regions pass through the content-aware region-of-interest pooling layer of step 3.3) to give a second, fixed-size low-dimensional encoding, which the classification networks of step 3.4) use to produce the class of each target and a refined box position; the corresponding loss value is computed as in step 4.2). Finally, the two parts of the loss are combined as in step 4.3) to give the final loss value. Backpropagating this value yields the gradients of the parameters of each layer of the network model of step 3), and the stochastic gradient descent algorithm uses these gradients to optimize the parameters of each layer. This completes one round of training of the network model;
5.3) Repeat step 5.2) until the network's multi-scale target detection ability reaches the expected goal.
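The loop formed by steps 5.2) and 5.3) has the shape sketched below: compute a loss, obtain gradients, take a gradient descent step, and repeat until a target is met. A toy quadratic loss stands in for the detection loss, and `target_loss`, `max_rounds`, and the learning rate are hypothetical stand-ins for the "expected goal" that the source leaves unspecified.

```python
def sgd_step(params, grads, lr):
    """One gradient descent update: p <- p - lr * g for every parameter."""
    return [p - lr * g for p, g in zip(params, grads)]


def train(params, loss_fn, grad_fn, lr, target_loss, max_rounds):
    """Repeat training rounds until the loss reaches the target (or rounds run out)."""
    for round_idx in range(max_rounds):
        if loss_fn(params) <= target_loss:
            return params, round_idx
        params = sgd_step(params, grad_fn(params), lr)
    return params, max_rounds


# Toy stand-in for the network loss: sum of squares, with gradient 2 * p.
def toy_loss(params):
    return sum(p * p for p in params)


def toy_grad(params):
    return [2.0 * p for p in params]


final_params, rounds = train([4.0, -2.0], toy_loss, toy_grad,
                             lr=0.25, target_loss=0.01, max_rounds=100)
```

In the real method the gradients come from backpropagating the total loss through all sub-networks, and each round uses a randomly chosen training image rather than a fixed function.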
Step 6) is carried out as follows:
Some original images are taken at random from the validation data set, processed as in step 2), and fed to the network model trained in step 5). The model detects the positions of the targets in each image and predicts their classes. The outputs are compared with the corresponding label data to judge the multi-scale target detection ability of the trained network model.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. A new network layer is proposed, the content-aware region-of-interest pooling layer (CAROIPooling, Content-Aware ROIPooling layer). It maps a region of the original image onto the low-dimensional encoding and pools it to a fixed size. In particular, it completes the information in the low-dimensional encoded feature maps of small-scale objects, giving more accurate and more complete low-dimensional encoded feature maps. The layer can also be used in other target detection networks.
2. A multi-branch target detection network is proposed in which different branches handle the large-scale and small-scale detection tasks, so that large-scale and small-scale objects are distinguished and detected more accurately, overcoming the limitations of existing methods.
Description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Fig. 2 is a schematic diagram of the feature extraction network.
Fig. 3 is a schematic diagram of the region generation network.
Fig. 4 is a schematic diagram of the classification network.
Detailed description
The invention is further described below with reference to a specific embodiment.
As shown in Fig. 1, the multi-scale target detection method based on a deep convolutional neural network provided by this embodiment is as follows:
Step 1: Obtain a highway video data set, extract its video frames, annotate them manually, and divide them into a training data set and a validation data set.
Step 2: Convert the images and label data of the image data set through preprocessing into the format required for training the deep convolutional neural network, comprising the following steps:
Step 2.1: Scale each image in the data set to 768×1344 pixels (length × width), and scale the label data by the same ratio.
Step 2.2: From the scaled image, randomly crop a region that contains labels to obtain a square image of 768×768 pixels.
Step 2.3: Flip the cropped image horizontally at random with probability 0.5.
Step 2.4: Convert the pixel values of the randomly flipped image from the range [0,255] to the range [-1,1].
Step 3: Build the network model, which includes the feature extraction network, the region generation network, and the multi-task classification network, comprising the following steps:
Step 3.1: Construct the feature extraction network. Its input is a 3×768×768 image and its output is a series of low-dimensional encoded feature maps (512×48×48 and 512×24×24). The network contains several cascaded downsampling layers, each consisting of a convolutional layer, a batch normalization layer, a nonlinear activation function layer, and a pooling layer connected in series. A specific example of the feature extraction network model is shown in Fig. 2.
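The two output sizes in step 3.1 can be checked with simple arithmetic. The sketch assumes that each pooling layer halves the spatial size (2×2 pooling with stride 2) and that the 3×3 convolutions preserve it (stride 1 with padding 1). Neither the pooling size nor the number of pooling stages is stated in this text; the counts 4 and 5 are inferred from the sizes 48 and 24.

```python
def pooled_size(size, num_pools, factor=2):
    """Spatial size after `num_pools` pooling layers that each divide by `factor`."""
    for _ in range(num_pools):
        size //= factor
    return size


# 768 -> 384 -> 192 -> 96 -> 48 after four poolings, and 24 after a fifth.
shallow = pooled_size(768, 4)
deep = pooled_size(768, 5)
```

Under these assumptions the 48×48 map has a stride of 16 pixels and the 24×24 map a stride of 32 pixels relative to the 768×768 input, which is why the shallower map retains more detail for small targets.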
Step 3.2: Construct the region generation network. Its input is a feature map of size 512×48×48 or 512×24×24, and its outputs are matrices of size 36×48×48 or 36×24×24 and of size 18×48×48 or 18×24×24. The network contains three structures connected in series (convolutional layer, batch normalization layer, nonlinear activation function layer). A specific example of the region generation network model is shown in Fig. 3.
Step 3.3: Construct the multi-task classification network. This example uses two classification networks. Each takes a vector of length 512×7×7 as input and outputs a vector A of length 4 and a vector B of length 4. The four values of A are the class scores for background, car, bus, and train. The four values of B give the position of a box (the coordinates x and y of its top-left corner and its width w and height h). The network contains fully connected layers, nonlinear activation function layers, and dropout layers. A specific example of the multi-task classification network model is shown in Fig. 4.
Step 4: Define the loss functions of the region generation network and the classification networks, comprising the following steps:
Step 4.1: Define the loss function of the region generation network. A regression loss is defined so that the output boxes lie as close as possible to the ground-truth reference boxes; SmoothL1Loss is used here. A classification loss is defined so that the foreground scores of the output boxes are as close as possible to the label data; SoftmaxLoss is used here.
Step 4.2: Define the loss function of the classification networks. A classification loss is defined so that the class scores of the output boxes are as close as possible to the label data, over 4 classes. A regression loss is defined so that the output boxes lie as close as possible to the ground-truth reference boxes.
Step 4.3: Define the total loss function as the weighted sum of the four losses above, expressed by the formula:
$$\mathrm{Loss} = (w_1 L_{cls} + w_2 L_{reg})_{\text{region generation network}} + (w_3 L_{cls} + w_4 L_{reg})_{\text{classification network}}$$
where Loss is the total loss value, w_1, w_2, w_3, and w_4 are the weights (in this example w_1 = w_2 = w_3 = w_4 = 1), L_cls is the classification loss value, and L_reg is the regression loss value.
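The weighted sum of step 4.3 is a one-line computation. The sketch assumes that w1 and w2 weight the region generation network terms and w3 and w4 weight the classification network terms, which is what the list of four weights suggests; the argument names are hypothetical.

```python
def total_loss(rpn_cls, rpn_reg, det_cls, det_reg, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the region generation and classification network losses."""
    w1, w2, w3, w4 = weights
    return (w1 * rpn_cls + w2 * rpn_reg) + (w3 * det_cls + w4 * det_reg)
```

With all weights equal to 1, as in this embodiment, the total is simply the sum of the four losses; unequal weights would let one task dominate the gradient.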
Step 5: Train the network model, comprising the following steps:
Step 5.1: Initialize the parameters of each layer of the model. The convolutional layers of the feature extraction network take as initial values the convolutional layer parameters of a VGG16 network model pre-trained on the large ImageNet database. The convolutional layers of the region generation network and the fully connected layers of the classification networks are initialized from a Gaussian distribution with mean 0 and standard deviation 0.02, and the parameters of all batch normalization layers are initialized from a Gaussian distribution with mean 1 and standard deviation 0.02.
Step 5.2: Train the network model. An image processed in step 2 is chosen at random and fed to the network model of step 3, which outputs class information and the coordinates of the regressed boxes. The final loss value is then computed as in step 4. Backpropagating this value yields the gradients of the parameters of each layer of the network model of step 3, and the stochastic gradient descent algorithm uses these gradients to optimize the parameters of each layer. This completes one round of training of the network model.
Step 5.3: Continue iterative training, that is, repeat step 5.2 until the network's multi-scale target detection ability reaches the expected goal.
Step 6: Validate the trained model on the validation data set to test its generalization performance.
Specifically, some original images are taken at random from the validation data set, processed as in step 2, and fed to the network model trained in step 5. The model detects the positions of the targets in each image and predicts their classes. The outputs are compared with the corresponding label data to judge the multi-scale target detection ability of the trained network model.
The embodiment described above is only a preferred embodiment of the invention and does not limit its scope of implementation. Any change made according to the form and principle of the invention shall therefore fall within the scope of protection of the invention.
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711267789.7A CN108564097B (en) | 2017-12-05 | 2017-12-05 | Multi-scale target detection method based on deep convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108564097A (en) | 2018-09-21 |
CN108564097B (en) | 2020-09-22 |
Family
ID=63529242
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711267789.7A Active CN108564097B (en) | 2017-12-05 | 2017-12-05 | Multi-scale target detection method based on deep convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108564097B (en) |
Families Citing this family (98)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109361617B (en) * | 2018-09-26 | 2022-09-27 | 中国科学院计算机网络信息中心 | A convolutional neural network traffic classification method and system based on network packet load |
CN109446911B (en) * | 2018-09-28 | 2021-08-06 | 北京陌上花科技有限公司 | Image detection method and system |
CN109492636B (en) * | 2018-09-30 | 2021-08-03 | 浙江工业大学 | Object detection method based on adaptive receptive field deep learning |
CN109376619B (en) * | 2018-09-30 | 2021-10-15 | 中国人民解放军陆军军医大学 | Cell detection method |
CN109525859B (en) * | 2018-10-10 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Model training method, image sending method, image processing method and related device equipment |
CN109558791B (en) * | 2018-10-11 | 2020-12-01 | 浙江大学宁波理工学院 | Bamboo shoot searching device and method based on image recognition |
CN109344806B (en) * | 2018-10-31 | 2019-08-23 | 第四范式(北京)技术有限公司 | The method and system detected using multitask target detection model performance objective |
CN109634820A (en) * | 2018-11-01 | 2019-04-16 | 华中科技大学 | A kind of fault early warning method, relevant device and the system of the collaboration of cloud mobile terminal |
CN109583321A (en) * | 2018-11-09 | 2019-04-05 | 同济大学 | The detection method of wisp in a kind of structured road based on deep learning |
CN109523015B (en) * | 2018-11-09 | 2021-10-22 | 上海海事大学 | A kind of image processing method in neural network |
CN109583483B (en) * | 2018-11-13 | 2020-12-11 | 中国科学院计算技术研究所 | A target detection method and system based on convolutional neural network |
CN111260536B (en) * | 2018-12-03 | 2022-03-08 | 中国科学院沈阳自动化研究所 | Digital image multi-scale convolution processor with variable parameters and implementation method thereof |
CN111310775B (en) * | 2018-12-11 | 2023-08-25 | Tcl科技集团股份有限公司 | Data training method, device, terminal equipment and computer readable storage medium |
CN109753995B (en) * | 2018-12-14 | 2021-01-01 | 中国科学院深圳先进技术研究院 | Optimization method of 3D point cloud target classification and semantic segmentation network based on PointNet + |
CN109753959B (en) * | 2018-12-21 | 2022-05-13 | 西北工业大学 | Pavement traffic sign detection method based on adaptive multi-scale feature fusion |
CN109766790B (en) * | 2018-12-24 | 2022-08-23 | 重庆邮电大学 | Pedestrian detection method based on self-adaptive characteristic channel |
CN109685066B (en) * | 2018-12-24 | 2021-03-09 | 中国矿业大学(北京) | Mine target detection and identification method based on deep convolutional neural network |
CN110889425A (en) * | 2018-12-29 | 2020-03-17 | 研祥智能科技股份有限公司 | Target detection method based on deep learning |
CN109726690B (en) * | 2018-12-30 | 2023-04-18 | 陕西师范大学 | Multi-region description method for learner behavior image based on DenseCap network |
CN109741318B (en) * | 2018-12-30 | 2022-03-29 | 北京工业大学 | Real-time detection method of single-stage multi-scale specific target based on effective receptive field |
CN109753927B (en) | 2019-01-02 | 2025-03-07 | 腾讯科技(深圳)有限公司 | A face detection method and device |
CN109784476B (en) * | 2019-01-12 | 2022-08-16 | 福州大学 | Method for improving DSOD network |
CN109829421B (en) * | 2019-01-29 | 2020-09-08 | 西安邮电大学 | Method and device for vehicle detection and computer readable storage medium |
CN111523351A (en) * | 2019-02-02 | 2020-08-11 | 北京地平线机器人技术研发有限公司 | Neural network training method and device and electronic equipment |
CN109977997B (en) * | 2019-02-13 | 2021-02-02 | 中国科学院自动化研究所 | Image target detection and segmentation method based on convolutional neural network rapid robustness |
CN109919214B (en) * | 2019-02-27 | 2023-07-21 | 南京地平线机器人技术有限公司 | Training method and training device for neural network model |
CN109949229A (en) * | 2019-03-01 | 2019-06-28 | 北京航空航天大学 | A multi-platform and multi-view target collaborative detection method |
CN111695380B (en) * | 2019-03-13 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | Target detection method and device |
CN110120047B (en) * | 2019-04-04 | 2023-08-08 | 平安科技(深圳)有限公司 | Image segmentation model training method, image segmentation method, device, equipment and medium |
CN109977918B (en) * | 2019-04-09 | 2023-05-02 | 华南理工大学 | An Optimization Method for Object Detection and Localization Based on Unsupervised Domain Adaptation |
CN110072119B (en) * | 2019-04-11 | 2020-04-10 | 西安交通大学 | Content-aware video self-adaptive transmission method based on deep learning network |
CN110084165B (en) * | 2019-04-19 | 2020-02-07 | 山东大学 | Intelligent identification and early warning method for abnormal events in open scene of power field based on edge calculation |
CN110070530B (en) * | 2019-04-19 | 2020-04-10 | 山东大学 | Transmission line icing detection method based on deep neural network |
CN110135480A (en) * | 2019-04-30 | 2019-08-16 | 南开大学 | A network data learning method based on unsupervised object detection to eliminate bias |
CN110215232A (en) * | 2019-04-30 | 2019-09-10 | 南方医科大学南方医院 | Ultrasonic patch analysis method in coronary artery based on algorithm of target detection |
CN110929746A (en) * | 2019-05-24 | 2020-03-27 | 南京大学 | A deep neural network-based method for location, extraction and classification of electronic file titles |
CN110288082B (en) * | 2019-06-05 | 2022-04-05 | 北京字节跳动网络技术有限公司 | Convolutional neural network model training method and device and computer readable storage medium |
CN110298387A (en) * | 2019-06-10 | 2019-10-01 | 天津大学 | Incorporate the deep neural network object detection method of Pixel-level attention mechanism |
CN110298266B (en) * | 2019-06-10 | 2023-06-06 | 天津大学 | Object detection method based on deep neural network based on multi-scale receptive field feature fusion |
CN110348437B (en) * | 2019-06-27 | 2022-03-25 | 电子科技大学 | A Target Detection Method Based on Weakly Supervised Learning and Occlusion Awareness |
CN110288586A (en) * | 2019-06-28 | 2019-09-27 | 昆明能讯科技有限责任公司 | A kind of multiple dimensioned transmission line of electricity defect inspection method based on visible images data |
CN110472483B (en) * | 2019-07-02 | 2022-11-15 | 五邑大学 | SAR image-oriented small sample semantic feature enhancement method and device |
CN110399884B (en) * | 2019-07-10 | 2021-08-20 | 浙江理工大学 | A feature fusion adaptive anchor frame model vehicle detection method |
CN110349148A (en) * | 2019-07-11 | 2019-10-18 | 电子科技大学 | A Weakly Supervised Learning-Based Image Object Detection Method |
CN111027581A (en) * | 2019-08-23 | 2020-04-17 | 中国地质大学(武汉) | A 3D target detection method and system based on learnable coding |
CN110706205B (en) * | 2019-09-07 | 2021-05-14 | 创新奇智(重庆)科技有限公司 | Method for detecting cloth hole-breaking defect by using computer vision technology |
CN110659724B (en) * | 2019-09-12 | 2023-04-28 | 复旦大学 | Construction Method of Deep Convolutional Neural Network for Target Detection Based on Target Scale |
CN112712097B (en) * | 2019-10-25 | 2024-01-05 | 杭州海康威视数字技术股份有限公司 | Image recognition method and device based on open platform and user side |
CN110909623B (en) * | 2019-10-31 | 2022-10-04 | 南京邮电大学 | Three-dimensional target detection method and three-dimensional target detector |
CN110991247B (en) * | 2019-10-31 | 2023-08-11 | 厦门思泰克智能科技股份有限公司 | Electronic component identification method based on deep learning and NCA fusion |
CN111008656B (en) * | 2019-11-29 | 2022-12-13 | 中国电子科技集团公司第二十研究所 | Target detection method based on prediction frame error multi-stage loop processing |
CN111222546B (en) * | 2019-12-27 | 2023-04-07 | 中国科学院计算技术研究所 | Multi-scale fusion food image classification model training and image classification method |
CN111178446B (en) * | 2019-12-31 | 2023-08-04 | 歌尔股份有限公司 | Optimization method and device of target classification model based on neural network |
CN111242897A (en) * | 2019-12-31 | 2020-06-05 | 北京深睿博联科技有限责任公司 | Chest X-ray image analysis method and device |
CN111241964A (en) * | 2020-01-06 | 2020-06-05 | 北京三快在线科技有限公司 | Training method and device of target detection model, electronic equipment and storage medium |
CN111242037B (en) * | 2020-01-15 | 2023-03-21 | 华南理工大学 | Lane line detection method based on structural information |
CN111275171B (en) * | 2020-01-19 | 2023-07-04 | 合肥工业大学 | A small target detection method based on multi-scale super-resolution reconstruction based on parameter sharing |
CN111274981B (en) * | 2020-02-03 | 2021-10-08 | 中国人民解放军国防科技大学 | Target detection network construction method and device and target detection method |
CN111444939B (en) * | 2020-02-19 | 2022-06-28 | 山东大学 | Small-scale equipment component detection method based on weak supervision cooperative learning in open scene of power field |
CN111340123A (en) * | 2020-02-29 | 2020-06-26 | 韶鼎人工智能科技有限公司 | Image score label prediction method based on deep convolutional neural network |
CN111445026B (en) * | 2020-03-16 | 2023-08-22 | 东南大学 | Acceleration method for deep neural network multi-path reasoning for edge intelligence applications |
CN111461190B (en) * | 2020-03-24 | 2023-03-28 | 华南理工大学 | Deep convolutional neural network-based non-equilibrium ship classification method |
CN111257341B (en) * | 2020-03-30 | 2023-06-16 | 河海大学常州校区 | Crack detection method for underwater buildings based on multi-scale features and stacked fully convolutional network |
CN111489332B (en) * | 2020-03-31 | 2023-03-17 | 成都数之联科技股份有限公司 | Multi-scale IOF random cutting data enhancement method for target detection |
CN111611846A (en) * | 2020-03-31 | 2020-09-01 | 北京迈格威科技有限公司 | Pedestrian re-identification method, device, electronic device and storage medium |
CN111553397B (en) * | 2020-04-21 | 2022-04-29 | 东南大学 | Cross-domain target detection method based on regional full convolution network and self-adaption |
CN112016542A (en) * | 2020-05-08 | 2020-12-01 | 珠海欧比特宇航科技股份有限公司 | Urban waterlogging intelligent detection method and system |
CN111597945B (en) * | 2020-05-11 | 2023-08-18 | 济南博观智能科技有限公司 | Target detection method, device, equipment and medium |
CN111931900B (en) * | 2020-05-29 | 2023-09-19 | 西安电子科技大学 | GIS discharge waveform detection method based on residual network and multi-scale feature fusion |
CN111626373B (en) * | 2020-06-01 | 2023-07-25 | 中国科学院自动化研究所 | Multi-scale widening residual network, small target recognition and detection network and its optimization method |
CN111783784A (en) * | 2020-06-30 | 2020-10-16 | 创新奇智(合肥)科技有限公司 | Method and device for detecting building cavity, electronic equipment and storage medium |
CN111860264B (en) * | 2020-07-10 | 2024-01-05 | 武汉理工大学 | Multi-task instance-level road scene understanding algorithm based on gradient equalization strategy |
CN111986126B (en) * | 2020-07-17 | 2022-05-24 | 浙江工业大学 | Multi-target detection method based on improved VGG16 network |
CN112288686B (en) * | 2020-07-29 | 2023-12-19 | 深圳市智影医疗科技有限公司 | Model training method and device, electronic equipment and storage medium |
CN112183579B (en) * | 2020-09-01 | 2023-05-30 | 国网宁夏电力有限公司检修公司 | Method, medium and system for detecting micro target |
CN112149521B (en) * | 2020-09-03 | 2024-05-07 | 浙江工业大学 | Palm print ROI extraction and enhancement method based on multitasking convolutional neural network |
CN112116079A (en) * | 2020-09-22 | 2020-12-22 | 视觉感知(北京)科技有限公司 | Solution for data transmission between neural networks |
CN112132816B (en) * | 2020-09-27 | 2022-12-30 | 北京理工大学 | Target detection method based on multitask and region-of-interest segmentation guidance |
CN112200089B (en) * | 2020-10-12 | 2021-09-14 | 西南交通大学 | Dense vehicle detection method based on vehicle counting perception attention |
CN112347967B (en) * | 2020-11-18 | 2023-04-07 | 北京理工大学 | A Pedestrian Detection Method Fused with Motion Information in Complex Scenes |
CN114547785B (en) * | 2020-11-25 | 2024-11-22 | 英业达科技有限公司 | Manufacturing equipment manufacturing parameter adjustment control system and method |
CN112348036B (en) * | 2020-11-26 | 2025-01-14 | 北京工业大学 | Adaptive object detection method based on lightweight residual learning and deconvolution cascade |
CN112560627A (en) * | 2020-12-09 | 2021-03-26 | 江苏集萃未来城市应用技术研究所有限公司 | Real-time detection method for abnormal behaviors of construction site personnel based on neural network |
CN112508016B (en) * | 2020-12-15 | 2024-04-16 | 深圳万兴软件有限公司 | Image processing method, device, computer equipment and storage medium |
CN112712133A (en) * | 2021-01-15 | 2021-04-27 | 北京华捷艾米科技有限公司 | Deep learning network model training method, related device and storage medium |
CN112836816B (en) * | 2021-02-04 | 2024-02-09 | 南京大学 | Training method suitable for crosstalk of photoelectric storage and calculation integrated processing unit |
CN113269182A (en) * | 2021-04-21 | 2021-08-17 | 山东师范大学 | Target fruit detection method and system based on small-area sensitivity of variant transform |
CN113326735B (en) * | 2021-04-29 | 2023-11-28 | 南京大学 | YOLOv 5-based multi-mode small target detection method |
CN113239775B (en) * | 2021-05-09 | 2023-05-02 | 西北工业大学 | Method for detecting and extracting tracks in azimuth lineage diagram based on hierarchical attention depth convolution neural network |
CN112990444B (en) * | 2021-05-13 | 2021-09-24 | 电子科技大学 | Hybrid neural network training method, system, equipment and storage medium |
CN113076962B (en) * | 2021-05-14 | 2022-10-21 | 电子科技大学 | Multi-scale target detection method based on micro neural network search technology |
CN113762278B (en) * | 2021-09-13 | 2023-11-17 | 中冶路桥建设有限公司 | Asphalt pavement damage identification method based on target detection |
CN114048536A (en) * | 2021-11-18 | 2022-02-15 | 重庆邮电大学 | A road structure prediction and target detection method based on multi-task neural network |
CN113902980B (en) * | 2021-11-24 | 2024-02-20 | 河南大学 | Remote sensing target detection method based on content perception |
CN114462487A (en) * | 2021-12-28 | 2022-05-10 | 浙江大华技术股份有限公司 | Target detection network training and detection method, device, terminal and storage medium |
CN114549958B (en) * | 2022-02-24 | 2023-08-04 | 四川大学 | Night and camouflage target detection method based on context information perception mechanism |
CN114687012A (en) * | 2022-02-25 | 2022-07-01 | 武汉智目智能技术合伙企业(有限合伙) | Efficient foreign fiber removing device and method for high-impurity-content raw cotton |
CN115049952B (en) * | 2022-04-24 | 2023-04-07 | 南京农业大学 | Juvenile fish limb identification method based on multi-scale cascade perception deep learning network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105320963A (en) * | 2015-10-21 | 2016-02-10 | 哈尔滨工业大学 | High resolution remote sensing image oriented large scale semi-supervised feature selection method |
CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | A kind of model recognizing method based on quick R CNN deep neural network |
CN106529402A (en) * | 2016-09-27 | 2017-03-22 | 中国科学院自动化研究所 | Multi-task learning convolutional neural network-based face attribute analysis method |
CN106845430A (en) * | 2017-02-06 | 2017-06-13 | 东华大学 | Pedestrian detection and tracking based on acceleration region convolutional neural networks |
CN107103590A (en) * | 2017-03-22 | 2017-08-29 | 华南理工大学 | A kind of image for resisting generation network based on depth convolution reflects minimizing technology |
CN107341517A (en) * | 2017-07-07 | 2017-11-10 | 哈尔滨工业大学 | The multiple dimensioned wisp detection method of Fusion Features between a kind of level based on deep learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150342560A1 (en) * | 2013-01-25 | 2015-12-03 | Ultrasafe Ultrasound Llc | Novel Algorithms for Feature Detection and Hiding from Ultrasound Images |
US10002313B2 (en) * | 2015-12-15 | 2018-06-19 | Sighthound, Inc. | Deeply learned convolutional neural networks (CNNS) for object localization and classification |
Non-Patent Citations (1)
Title |
---|
A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection;Zhaowei Cai.et.;《ECCV 2016》;20161231;第354-370页 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108564097B (en) | Multi-scale target detection method based on deep convolutional neural network | |
CN109977918B (en) | An Optimization Method for Object Detection and Localization Based on Unsupervised Domain Adaptation | |
CN110298266B (en) | Object detection method based on deep neural network based on multi-scale receptive field feature fusion | |
Wang et al. | An improved light-weight traffic sign recognition algorithm based on YOLOv4-tiny | |
CN113642390B (en) | Street view image semantic segmentation method based on local attention network | |
CN109934200B (en) | RGB color remote sensing image cloud detection method and system based on improved M-Net | |
CN110738207A (en) | character detection method for fusing character area edge information in character image | |
CN110276269A (en) | An Attention Mechanism Based Target Detection Method for Remote Sensing Images | |
CN114187450A (en) | A deep learning-based semantic segmentation method for remote sensing images | |
CN111860171A (en) | A method and system for detecting irregularly shaped targets in large-scale remote sensing images | |
CN117078942B (en) | Context-aware refereed image segmentation method, system, device and storage medium | |
CN116524189A (en) | High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization | |
CN117409190B (en) | Real-time infrared image target detection method, device, equipment and storage medium | |
CN115546569A (en) | An attention mechanism-based data classification optimization method and related equipment | |
CN116912708A (en) | Remote sensing image building extraction method based on deep learning | |
CN118691815A (en) | A high-quality automatic instance segmentation method for remote sensing images based on fine-tuning of the SAM large model | |
CN110852327A (en) | Image processing method, device, electronic device and storage medium | |
CN114332473A (en) | Object detection method, object detection device, computer equipment, storage medium and program product | |
Zheng et al. | Feature enhancement for multi-scale object detection | |
CN116363526A (en) | MROCNet model construction and multi-source remote sensing image change detection method and system | |
CN116012626B (en) | Material matching method, device, equipment and storage medium for building elevation image | |
Pang et al. | PTRSegNet: A Patch-to-Region Bottom–Up Pyramid Framework for the Semantic Segmentation of Large-Format Remote Sensing Images | |
CN112668662B (en) | Target detection method in wild mountain forest environment based on improved YOLOv3 network | |
CN118521791A (en) | Remote sensing image semantic segmentation method based on convolutional neural network and complete attention network | |
CN113011506A (en) | Texture image classification method based on depth re-fractal spectrum network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||