CN108537286A - Accurate recognition method for complex targets based on key-region detection - Google Patents
Accurate recognition method for complex targets based on key-region detection
- Publication number: CN108537286A (application CN201810345899.9A)
- Authority: CN (China)
- Prior art keywords: network, key area, sub, complex target, key
- Prior art date: 2018-04-18
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/254 — Pattern recognition; fusion techniques of classification results, e.g. of results related to same input data
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/24 — Pattern recognition; classification techniques
- G06N3/045 — Neural networks; combinations of networks
- G06V2201/07 — Image or video recognition; target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a method for accurately recognizing complex targets based on key-region detection, comprising: fusion-training the entire neural network by cross-training; extracting target features with a convolutional neural network; detecting the key regions of a complex target with a detection sub-network that uses anchor boxes as references; pooling the key regions into fixed-size feature maps by region standard pooling; classifying the key regions with a classification sub-network; and fusing the classification results of the individual key regions to recognize the target accurately. The network comprises a key-region detection sub-network and a key-region classification sub-network: the detection sub-network detects the discriminative key regions of the complex target, the classification sub-network classifies those regions, and the per-region classification results are fused to recognize the overall target. The two sub-networks share the features extracted by a VGG convolutional neural network, so complex targets are recognized both quickly and accurately.
Description
Technical Field
The invention relates to image processing technology, and in particular to a method for accurately recognizing complex targets based on key-region detection.
Background
The classification and recognition of complex targets is an important and fundamental task in computer vision. Different kinds of complex targets are largely identical or similar over most of their parts, while their differences are concentrated in a few local key regions; images of complex targets therefore contain a large amount of interference and redundant information. Existing classification and recognition methods for complex targets suffer from low accuracy because they cannot remove this interference and redundancy. To classify and recognize complex targets accurately, it is therefore of great significance to study a method for accurate complex-target recognition based on key-region detection.
Summary of the Invention
In view of this, the main purpose of the present invention is to provide a highly accurate method for recognizing complex targets based on key-region detection, one that greatly improves detection accuracy while keeping recognition fast.
To achieve the above purpose, the technical solution proposed by the present invention is a method for accurately recognizing complex targets based on key-region detection, implemented in the following steps:
Step 1. Read the complex-target images in the database's training samples, the coordinate labels of the complex targets' key regions, and the complex targets' classification labels, and fusion-train the accurate complex-target recognition network by cross-training.
Step 2. Feed the complex-target image to be recognized into the network trained in step 1 and extract features with the VGG convolutional neural network, obtaining the feature map of the image.
Step 3. Input the feature map obtained in step 2 into the key-region detection sub-network. Slide a 3×3 sub-network over the feature map and, with anchor boxes as references, detect the key regions of the complex-target image, outputting for each candidate a predicted box and the probabilities P_is and P_not that it is or is not a key region.
Step 4. Filter heavily overlapping detections with non-maximum suppression: when the ratio of the intersection area to the union area of different predicted boxes exceeds the prescribed threshold IOU_threshold, keep only the box with the largest key-region probability P_is and filter out the others (a sketch follows this list).
Step 5. Set a threshold P_threshold on the key-region probability P_is and map the regions whose P_is exceeds P_threshold onto the feature map extracted by the VGG network.
Step 6. Apply region standard pooling to the regions mapped onto the feature map in step 5, pooling the detected regions of different sizes into fixed-size feature maps.
Step 7. Use the fixed-size feature maps obtained in step 6 as input to the classification sub-network, classify them, and normalize the classification scores with the softmax function to obtain class probabilities for each key region.
Step 8. For the single complex target in an image, fuse the class probabilities of the individual key regions obtained in step 7 by averaging, yielding the accurate recognition result for the target's class.
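The non-maximum suppression of step 4 can be made concrete with a short, self-contained sketch. This is not code from the patent: it assumes boxes are given as (x1, y1, x2, y2) corners (the center/length/width boxes of step 3 would be converted first), and `iou_threshold` plays the role of IOU_threshold.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.7):
    """Keep the highest-P_is box among mutually overlapping predictions.

    boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) array of P_is."""
    order = np.argsort(scores)[::-1]          # highest P_is first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of the retained box with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_threshold]    # drop boxes overlapping beyond the threshold
    return keep
```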
In step 1, the whole network is cross-trained as follows:
Step 11. Initialize with the weights of a VGG network trained for the classification task on ImageNet images, and fine-tune from there.
Step 12. Read the complex-target images and the coordinate labels of their key regions, and train the key-region detection sub-network with the loss function loss = L_P + L_reg, where L_P is the cross-entropy between the key-region probabilities P_is, P_not output by the detection sub-network and the ground-truth labels, and L_reg is the sum of squared differences between the region coordinate offsets output by the detection sub-network and the actual key-region coordinate offsets in the labels.
Step 13. Read the complex-target images and their classification labels, and train the classification sub-network; the loss function is the cross-entropy between the network's classification output and the label.
Step 14. Repeat steps 12 and 13 several times, cross-training the key-region detection sub-network and the classification sub-network until the network is stable.
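The alternating schedule of steps 11-14 can be sketched as a plain training loop. `det_step` and `cls_step` are hypothetical helpers (not named in the patent) that each run one optimization pass over the corresponding sub-network on top of the shared VGG features; the fixed number of rounds stands in for "repeat until the network is stable".

```python
def cross_train(det_step, cls_step, data_loader, num_rounds=4, epochs_per_round=1):
    """Alternate detector and classifier training (steps 12-14), a minimal sketch."""
    for _ in range(num_rounds):
        # step 12: train the key-region detection sub-network (loss = L_P + L_reg)
        for _ in range(epochs_per_round):
            for images, region_boxes, _ in data_loader:
                det_step(images, region_boxes)
        # step 13: train the classification sub-network (cross-entropy loss)
        for _ in range(epochs_per_round):
            for images, _, class_labels in data_loader:
                cls_step(images, class_labels)
```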
In step 3, key regions are detected as follows:
Step 31. Slide a 3×3 window over the feature map obtained in step 2, producing a 512-dimensional vector at each position.
Step 32. At each sliding-window position, set 9 anchor boxes as references: the aspect ratio takes the three values 1:2, 1:1 and 2:1, and the area takes the three values 128², 256² and 512² pixels; the center of each anchor box is the center of the sliding window.
Step 33. Pass the 512-dimensional vector from each sliding-window position through a fully connected network that outputs nine 6-dimensional vectors. Each vector encodes, relative to one reference anchor box, the offsets d_x, d_y, d_l, d_w of the detected region's center coordinates, length and width, together with the key-region probabilities P_is and P_not, where d_x = (x − x_a)/l_a, d_y = (y − y_a)/w_a, d_l = log(l/l_a), d_w = log(w/w_a); here x, y, l, w are the center coordinates, length and width of the detected region, x_a, y_a, l_a, w_a are those of the reference anchor, and P_is, P_not are normalized with the softmax function.
Step 34. From the regression offsets d_x, d_y, d_l, d_w and the anchor box's center coordinates, length and width x_a, y_a, l_a, w_a, compute the actual center coordinates, length and width x, y, l, w of the detected region.
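Steps 32-34 translate directly into code. The sketch below generates the nine reference anchors at one sliding-window position and inverts the offset parameterization of step 33; it is an illustrative reading of the formulas, not the patent's implementation.

```python
import numpy as np

def make_anchors(center_x, center_y):
    """Step 32: nine anchors per position, 3 aspect ratios x 3 areas."""
    anchors = []
    for area in (128 ** 2, 256 ** 2, 512 ** 2):
        for ratio in (0.5, 1.0, 2.0):          # l : w = 1:2, 1:1, 2:1
            l = np.sqrt(area * ratio)
            w = np.sqrt(area / ratio)
            anchors.append((center_x, center_y, l, w))
    return anchors

def decode(dx, dy, dl, dw, xa, ya, la, wa):
    """Step 34: recover x, y, l, w by inverting the offsets of step 33."""
    x = dx * la + xa                           # from d_x = (x - x_a) / l_a
    y = dy * wa + ya                           # from d_y = (y - y_a) / w_a
    l = la * np.exp(dl)                        # from d_l = log(l / l_a)
    w = wa * np.exp(dw)                        # from d_w = log(w / w_a)
    return x, y, l, w
```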
In step 6, region standard pooling works as follows:
Step 61. Denote the size of the region to be pooled by m×n and divide it into a 7×7 grid of cells, each about m/7 × n/7 in size; when m/7 or n/7 is not an integer, round to the nearest integer.
Step 62. In each cell produced in step 61, apply max pooling to reduce the cell's features to 1×1; in this way, feature regions of different sizes are pooled into fixed-size 7×7 feature maps.
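A minimal NumPy sketch of region standard pooling as described in steps 61-62, assuming the mapped region arrives as an m×n×C feature array. The grid boundaries below are one plausible reading of the round-to-nearest rule; the clamping merely guards very small regions.

```python
import numpy as np

def region_standard_pool(region_feats, out_size=7):
    """Pool an m x n x C feature region into an out_size x out_size x C map."""
    m, n, c = region_feats.shape
    pooled = np.zeros((out_size, out_size, c), dtype=region_feats.dtype)
    # step 61: rounded boundaries approximating cells of size m/7 x n/7
    rows = np.round(np.linspace(0, m, out_size + 1)).astype(int)
    cols = np.round(np.linspace(0, n, out_size + 1)).astype(int)
    for i in range(out_size):
        for j in range(out_size):
            r0, r1 = min(rows[i], m - 1), min(max(rows[i + 1], rows[i] + 1), m)
            c0, c1 = min(cols[j], n - 1), min(max(cols[j + 1], cols[j] + 1), n)
            # step 62: max-pool each cell down to 1 x 1
            pooled[i, j] = region_feats[r0:r1, c0:c1].max(axis=(0, 1))
    return pooled
```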
In summary, the method of the present invention for accurately recognizing complex targets based on key-region detection comprises: fusion-training the entire neural network by cross-training, extracting target features with a convolutional neural network, detecting the key regions of complex targets with a detection sub-network that uses anchor boxes as references, pooling the key regions into fixed-size feature maps by region standard pooling, classifying the key regions with a classification sub-network, and fusing the classification results of the individual key regions to recognize the target accurately. The network comprises a key-region detection sub-network and a key-region classification sub-network: the detection sub-network detects the discriminative key regions of the complex target, the classification sub-network classifies those regions, and the per-region classification results are fused to recognize the overall target. The two sub-networks share the features extracted by the VGG convolutional neural network, so complex targets are recognized both quickly and accurately.
Compared with the prior art, the advantages of the present invention are:
(1) Accuracy: many different complex targets are similar over most of their extent, with their differences concentrated in local key regions. Traditional target recognition methods feed the whole image into the classification network, yet the whole image contains a large amount of redundant and interfering information, which limits recognition accuracy. The present method first detects the key regions with the detection sub-network, then recognizes them with the classification sub-network, and fuses the per-region recognition results to recognize the target accurately.
(2) Speed: the invention uses a deep neural network to extract features from the original image, and the detection and classification sub-networks share the features extracted by that same network. During training, the whole network is trained by cross-training. During testing, the two sub-networks share the same extracted features, which greatly reduces the network's parameter count and computation and enables fast target recognition.
Brief Description of the Drawings
Figure 1 is a schematic flowchart of the implementation of the invention.
Detailed Description
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawing and specific embodiments.
The method of the present invention for accurately recognizing complex targets based on key-region detection comprises: fusion-training the entire neural network by cross-training, extracting target features with a convolutional neural network, detecting the key regions of complex targets with a detection sub-network that uses anchor boxes as references, pooling the key regions into fixed-size feature maps by region standard pooling, classifying the key regions with a classification sub-network, and fusing the classification results of the individual key regions to recognize the target accurately. The network comprises a key-region detection sub-network and a key-region classification sub-network: the detection sub-network detects the discriminative key regions of the complex target, the classification sub-network classifies those regions, and the per-region classification results are fused to recognize the overall target. The two sub-networks share the features extracted by the VGG convolutional neural network, so complex targets are recognized both quickly and accurately.
As shown in Figure 1, the invention is implemented in the following specific steps:
Step 1. Read the complex-target images in the database's training samples, the coordinate labels of the key regions corresponding to each image, and the classification labels corresponding to each image, and fusion-train the accurate complex-target recognition network by cross-training.
Step 2. Feed the complex-target image to be recognized into the network trained in step 1 and extract features with the VGG convolutional neural network, obtaining the feature map of the image.
Step 3. Input the feature map obtained in step 2 into the key-region detection sub-network. Slide a 3×3 sub-network over the feature map and, with anchor boxes as references, detect the key regions of the complex-target image, outputting for each candidate a predicted box and the probabilities P_is and P_not that it is or is not a key region.
Step 4. Filter heavily overlapping detections with non-maximum suppression: when the ratio of the intersection area to the union area of different predicted boxes exceeds the prescribed threshold IOU_threshold, keep only the box with the largest key-region probability P_is and filter out the others.
Step 5. Set a threshold P_threshold on the key-region probability P_is and map the regions whose P_is exceeds P_threshold onto the feature map extracted by the VGG network.
Step 6. Apply region standard pooling to the regions mapped onto the feature map in step 5, pooling the detected regions of different sizes into fixed-size feature maps.
Step 7. Use the fixed-size feature maps obtained in step 6 as input to the classification sub-network, classify them, and normalize the classification scores with the softmax function to obtain class probabilities for each key region.
Step 8. For the single complex target in an image, fuse the class probabilities of the individual key regions obtained in step 7 by averaging, yielding the accurate recognition result for the target's class (see the fusion sketch after this list).
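Steps 7 and 8 amount to a softmax over each key region's class scores followed by an average across regions. The sketch below is illustrative: it assumes the classification sub-network's raw scores are collected into a (num_regions, num_classes) array.

```python
import numpy as np

def fuse_region_predictions(region_scores):
    """Steps 7-8: softmax-normalize each region's scores, then average them."""
    region_scores = np.asarray(region_scores, dtype=float)
    # step 7: softmax per key region (stabilized by subtracting the row max)
    e = np.exp(region_scores - region_scores.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    # step 8: fuse by taking the mean probability over all key regions
    fused = probs.mean(axis=0)
    return int(fused.argmax()), fused
```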
In step 1, the whole network is cross-trained as follows:
Step 11. Initialize with the weights of a VGG network trained for the classification task on ImageNet images, and fine-tune from there.
Step 12. Read the complex-target images and the coordinate labels of their key regions, and train the key-region detection sub-network with the loss function loss = L_P + L_reg, where L_P is the cross-entropy between the key-region probabilities P_is, P_not output by the detection sub-network and the ground-truth labels, and L_reg is the sum of squared differences between the region coordinate offsets output by the detection sub-network and the actual key-region coordinate offsets in the labels (a loss sketch follows this list).
Step 13. Read the complex-target images and their classification labels, and train the classification sub-network; the loss function is the cross-entropy between the network's classification output and the label.
Step 14. Repeat steps 12 and 13 several times, cross-training the key-region detection sub-network and the classification sub-network until the network is stable.
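The loss of step 12 can be sketched as follows. Two details are assumptions rather than statements of the patent: L_reg is accumulated only over anchors that actually cover a key region (as in standard region-proposal training), and the two terms are combined without weighting.

```python
import numpy as np

def detection_loss(p_is, is_key, pred_offsets, true_offsets):
    """Sketch of loss = L_P + L_reg from step 12 (illustrative, unweighted).

    p_is: (N,) predicted key-region probabilities; is_key: (N,) 0/1 labels;
    pred_offsets, true_offsets: (N, 4) arrays of (dx, dy, dl, dw)."""
    eps = 1e-9
    # L_P: cross-entropy between (P_is, P_not) and the ground-truth label
    l_p = -(is_key * np.log(p_is + eps)
            + (1 - is_key) * np.log(1 - p_is + eps)).mean()
    # L_reg: squared offset differences, summed over positive anchors
    # (normalized by the positive count here only to keep the scale stable)
    sq = ((pred_offsets - true_offsets) ** 2).sum(axis=1)
    l_reg = (sq * is_key).sum() / max(int(is_key.sum()), 1)
    return l_p + l_reg
```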
In step 3, key regions are detected as follows:
Step 31. Slide a 3×3 window over the feature map obtained in step 2, producing a 512-dimensional vector at each position.
Step 32. At each sliding-window position, set 9 anchor boxes as references: the aspect ratio takes the three values 1:2, 1:1 and 2:1, and the area takes the three values 128², 256² and 512² pixels; the center of each anchor box is the center of the sliding window.
Step 33. Pass the 512-dimensional vector from each sliding-window position through a fully connected network that outputs nine 6-dimensional vectors. Each vector encodes, relative to one reference anchor box, the offsets d_x, d_y, d_l, d_w of the detected region's center coordinates, length and width, together with the key-region probabilities P_is and P_not, where d_x = (x − x_a)/l_a, d_y = (y − y_a)/w_a, d_l = log(l/l_a), d_w = log(w/w_a); here x, y, l, w are the center coordinates, length and width of the detected region, x_a, y_a, l_a, w_a are those of the reference anchor, and P_is, P_not are normalized with the softmax function (a sketch of this head follows this list).
Step 34. From the regression offsets d_x, d_y, d_l, d_w and the anchor box's center coordinates, length and width x_a, y_a, l_a, w_a, compute the actual center coordinates, length and width x, y, l, w of the detected region.
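Architecturally, the 3×3 sliding window of step 31 followed by the fully connected outputs of step 33 is equivalent to a 3×3 convolution followed by 1×1 convolutions. The PyTorch sketch below is one plausible rendering of that head; the layer names, the ReLU, and the padding choice are assumptions not specified in the patent.

```python
import torch
import torch.nn as nn

class KeyRegionDetectionHead(nn.Module):
    """One reading of steps 31-33: a 3x3 window, then per-anchor outputs."""
    def __init__(self, in_channels=512, num_anchors=9):
        super().__init__()
        # step 31: 3x3 sliding window -> 512-d vector at each position
        self.window = nn.Conv2d(in_channels, 512, kernel_size=3, padding=1)
        # step 33: per anchor, 4 offsets (dx, dy, dl, dw) and 2 scores (P_is, P_not)
        self.offsets = nn.Conv2d(512, num_anchors * 4, kernel_size=1)
        self.scores = nn.Conv2d(512, num_anchors * 2, kernel_size=1)

    def forward(self, feature_map):
        h = torch.relu(self.window(feature_map))
        d = self.offsets(h)                       # (N, 9*4, H, W)
        p = self.scores(h)                        # (N, 9*2, H, W)
        n, _, height, width = p.shape
        # softmax over the (P_is, P_not) pair of each anchor, as in step 33
        p = p.view(n, -1, 2, height, width).softmax(dim=2)
        return d, p
```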
In step 6, the process of region standard pooling is as follows:
Step 61. Denote the size of the region to be pooled by m×n and divide it into a 7×7 grid of cells, each about m/7 × n/7 in size; when m/7 or n/7 is not an integer, round to the nearest integer.
Step 62. In each cell produced in step 61, apply max pooling to reduce the cell's features to 1×1; in this way, feature regions of different sizes are pooled into fixed-size 7×7 feature maps.
To sum up, the above is only a preferred embodiment of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810345899.9A CN108537286B (en) | 2018-04-18 | 2018-04-18 | An Accurate Recognition Method of Complex Targets Based on Key Area Detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810345899.9A CN108537286B (en) | 2018-04-18 | 2018-04-18 | An Accurate Recognition Method of Complex Targets Based on Key Area Detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108537286A true CN108537286A (en) | 2018-09-14 |
CN108537286B CN108537286B (en) | 2020-11-24 |
Family ID: 63481345
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810345899.9A Active CN108537286B (en) | 2018-04-18 | 2018-04-18 | An Accurate Recognition Method of Complex Targets Based on Key Area Detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108537286B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109410601A (en) * | 2018-12-04 | 2019-03-01 | 北京英泰智科技股份有限公司 | Method for controlling traffic signal lights, device, electronic equipment and storage medium |
CN109829398A (en) * | 2019-01-16 | 2019-05-31 | 北京航空航天大学 | A kind of object detection method in video based on Three dimensional convolution network |
CN110852285A (en) * | 2019-11-14 | 2020-02-28 | 腾讯科技(深圳)有限公司 | Object detection method and device, computer equipment and storage medium |
WO2020057145A1 (en) * | 2018-09-21 | 2020-03-26 | Boe Technology Group Co., Ltd. | Method and device for generating painting display sequence, and computer storage medium |
CN110929678A (en) * | 2019-12-04 | 2020-03-27 | 山东省计算中心(国家超级计算济南中心) | Method for detecting candida vulva vagina spores |
CN110955380A (en) * | 2018-09-21 | 2020-04-03 | 中科寒武纪科技股份有限公司 | Access data generation method, storage medium, computer device and apparatus |
CN111612797A (en) * | 2020-03-03 | 2020-09-01 | 江苏大学 | A rice image information processing system |
CN111931877A (en) * | 2020-10-12 | 2020-11-13 | 腾讯科技(深圳)有限公司 | Target detection method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | A kind of model recognizing method based on quick R CNN deep neural network |
CN106599939A (en) * | 2016-12-30 | 2017-04-26 | 深圳市唯特视科技有限公司 | Real-time target detection method based on region convolutional neural network |
CN107169421A (en) * | 2017-04-20 | 2017-09-15 | 华南理工大学 | A kind of car steering scene objects detection method based on depth convolutional neural networks |
CN107368845A (en) * | 2017-06-15 | 2017-11-21 | 华南理工大学 | A kind of Faster R CNN object detection methods based on optimization candidate region |
CN107798335A (en) * | 2017-08-28 | 2018-03-13 | 浙江工业大学 | A kind of automobile logo identification method for merging sliding window and Faster R CNN convolutional neural networks |
- 2018-04-18: application CN201810345899.9A filed in China; granted as CN108537286B (status: active)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | A kind of model recognizing method based on quick R CNN deep neural network |
CN106599939A (en) * | 2016-12-30 | 2017-04-26 | 深圳市唯特视科技有限公司 | Real-time target detection method based on region convolutional neural network |
CN107169421A (en) * | 2017-04-20 | 2017-09-15 | 华南理工大学 | A kind of car steering scene objects detection method based on depth convolutional neural networks |
CN107368845A (en) * | 2017-06-15 | 2017-11-21 | 华南理工大学 | A kind of Faster R CNN object detection methods based on optimization candidate region |
CN107798335A (en) * | 2017-08-28 | 2018-03-13 | 浙江工业大学 | A kind of automobile logo identification method for merging sliding window and Faster R CNN convolutional neural networks |
Non-Patent Citations (6)
Title |
---|
Bo Zhao et al., "A survey on deep learning-based fine-grained object classification and semantic segmentation", International Journal of Automation and Computing |
Shaoqing Ren et al., "Faster R-CNN: Towards real-time object detection with region proposal networks", IEEE Transactions on Pattern Analysis and Machine Intelligence |
Xiangteng He et al., "Fine-grained discriminative localization via saliency-guided Faster R-CNN", MM '17: Proceedings of the 25th ACM International Conference on Multimedia |
Yingfeng Cai et al., "Scene-Adaptive Vehicle Detection Algorithm Based on a Composite Deep Structure", IEEE Access |
Wu Fan, "Research on fine-grained vehicle model recognition based on deep learning", HTTP://WWW.DOC88.COM/P-7708621280922.HTML |
Li Xinye et al., "Fine-grained bird recognition based on semantic detection with convolutional neural networks", Science Technology and Engineering |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110955380A (en) * | 2018-09-21 | 2020-04-03 | 中科寒武纪科技股份有限公司 | Access data generation method, storage medium, computer device and apparatus |
WO2020057145A1 (en) * | 2018-09-21 | 2020-03-26 | Boe Technology Group Co., Ltd. | Method and device for generating painting display sequence, and computer storage medium |
CN109410601A (en) * | 2018-12-04 | 2019-03-01 | 北京英泰智科技股份有限公司 | Method for controlling traffic signal lights, device, electronic equipment and storage medium |
CN109829398A (en) * | 2019-01-16 | 2019-05-31 | 北京航空航天大学 | A kind of object detection method in video based on Three dimensional convolution network |
CN109829398B (en) * | 2019-01-16 | 2020-03-31 | 北京航空航天大学 | A method for object detection in video based on 3D convolutional network |
CN110852285A (en) * | 2019-11-14 | 2020-02-28 | 腾讯科技(深圳)有限公司 | Object detection method and device, computer equipment and storage medium |
CN110852285B (en) * | 2019-11-14 | 2023-04-18 | 腾讯科技(深圳)有限公司 | Object detection method and device, computer equipment and storage medium |
CN110929678A (en) * | 2019-12-04 | 2020-03-27 | 山东省计算中心(国家超级计算济南中心) | Method for detecting candida vulva vagina spores |
CN110929678B (en) * | 2019-12-04 | 2023-04-25 | 山东省计算中心(国家超级计算济南中心) | Method for detecting vulvovaginal candida spores |
CN111612797A (en) * | 2020-03-03 | 2020-09-01 | 江苏大学 | A rice image information processing system |
CN111612797B (en) * | 2020-03-03 | 2021-05-25 | 江苏大学 | Rice image information processing system |
CN111931877A (en) * | 2020-10-12 | 2020-11-13 | 腾讯科技(深圳)有限公司 | Target detection method, device, equipment and storage medium |
CN111931877B (en) * | 2020-10-12 | 2021-01-05 | 腾讯科技(深圳)有限公司 | Target detection method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108537286B (en) | 2020-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108537286A (en) | A kind of accurate recognition methods of complex target based on key area detection | |
WO2017190574A1 (en) | Fast pedestrian detection method based on aggregation channel features | |
Sirmacek et al. | Urban-area and building detection using SIFT keypoints and graph theory | |
CN107506763B (en) | An accurate positioning method of multi-scale license plate based on convolutional neural network | |
İlsever et al. | Two-dimensional change detection methods: remote sensing applications | |
CN110210475B (en) | A non-binarization and edge detection method for license plate character image segmentation | |
CN109061600B (en) | A Target Recognition Method Based on Millimeter Wave Radar Data | |
Yan et al. | Detection and classification of pole-like road objects from mobile LiDAR data in motorway environment | |
CN111274926B (en) | Image data screening method, device, computer equipment and storage medium | |
CN107480620B (en) | Remote sensing image automatic target identification method based on heterogeneous feature fusion | |
CN104182985B (en) | Remote sensing image change detection method | |
Zhang et al. | Road recognition from remote sensing imagery using incremental learning | |
CN108629286B (en) | Remote sensing airport target detection method based on subjective perception significance model | |
CN108492298B (en) | Multispectral image change detection method based on generation countermeasure network | |
CN109784392A (en) | A kind of high spectrum image semisupervised classification method based on comprehensive confidence | |
KR102757154B1 (en) | Clothes defect detection algorithm using CNN image processing | |
CN105976376B (en) | A target detection method for high-resolution SAR images based on component model | |
CN111898627B (en) | A PCA-based SVM Cloud Particle Optimal Classification and Recognition Method | |
Yuan et al. | Learning to count buildings in diverse aerial scenes | |
CN112819753B (en) | Building change detection method and device, intelligent terminal and storage medium | |
CN113158954B (en) | Automatic detection method for zebra crossing region based on AI technology in traffic offsite | |
CN106056139A (en) | Forest fire smoke/fog detection method based on image segmentation | |
KR101742115B1 (en) | An inlier selection and redundant removal method for building recognition of multi-view images | |
CN109934216A (en) | Image processing method, apparatus, and computer-readable storage medium | |
Indrabayu et al. | Blob modification in counting vehicles using gaussian mixture models under heavy traffic |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |