CN102867195B - Method for detecting and identifying a plurality of types of objects in remote sensing image
- Publication number: CN102867195B (application CN201210300645.8A)
- Authority: CN (China)
- Legal status: Active
Abstract
The invention relates to a method for detecting and recognizing multiple classes of targets in remote sensing images based on sparse-representation dictionary learning. Its technical features are as follows: first, a dictionary is trained on preprocessed training data with a sparse-representation dictionary-learning method; then each sub-image block of the test image is sparsely coded over the trained dictionary, its sparse representation coefficients are computed, and from them the reconstruction error of the block is obtained; thresholding the reconstruction error determines the candidate target regions; finally, post-processing yields accurate detection and recognition of multiple classes of targets in the remote sensing image. With the method of the invention, targets of multiple types can be detected and recognized in remote sensing images with complex backgrounds. The invention achieves high detection and recognition accuracy with a low false-alarm rate.
Description
Technical Field
The invention relates to a method for detecting and recognizing multiple classes of targets in remote sensing images, applicable to the detection and recognition of multiple target types in remote sensing images with complex backgrounds.
Background
As an application of remote sensing image processing, target detection and recognition in remote sensing images with complex backgrounds is a key technology for military reconnaissance, precision strike, and related fields. It has long been both a research focus and a difficult problem in the field; its considerable military and civilian value has attracted growing attention.
At present there are two main approaches to target detection in remote sensing images. The first detects targets by certain shape and geometric features they possess; however, because remote sensing backgrounds are complex and contain many shapes and geometric structures similar to those of the targets, relying on such features alone produces large numbers of missed and false detections. The second is based on classification, most commonly the Bag-of-Words (BoW) method: SIFT features are extracted from the image and clustered, the cluster centers serve as a set of standard bases (standard image regions) of the image space, images are then represented as vectors over these bases, and the resulting vectors are classified with an SVM and thresholded to obtain detections. Although the extracted SIFT features are scale- and rotation-invariant, BoW uses only the statistics of the feature regions and discards their spatial information, so its detection rate is low and its false-alarm rate high. Another classification method, Linear Spatial Pyramid Matching Using Sparse Coding (ScSPM), does account for the spatial information of the feature regions, but the resulting classification vectors have excessively high dimension and the computational cost is too large. Moreover, most current classification-based detection methods handle only a single target class and cannot detect and recognize multiple targets simultaneously.
Summary of the Invention
Technical Problem to Be Solved
To overcome the shortcomings of the prior art, the present invention proposes a method for detecting and recognizing multiple classes of targets in remote sensing images based on sparse-representation dictionary learning. The method automatically detects and recognizes targets of different types in remote sensing images with complex backgrounds, with high detection accuracy and a low false-alarm rate.
Technical Solution
A method for detecting and recognizing multiple classes of targets in remote sensing images, characterized by the following steps:
Step 1: Train a dictionary with a sparse-representation dictionary-learning method, as follows:
Step a1, training-image preparation: first align all targets of the same class in the original images to a common principal direction; then rotate each direction-normalized image from 0° to 360° in steps of φ, producing ⌊360°/φ⌋ images in different orientations. Processing the original images of all target classes in this way yields c classes of training images, where p is the number of target classes to be detected, φ is the rotation step, c is the total number of orientation-specific classes over all targets, and ⌊·⌋ denotes rounding down.
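A minimal sketch of this rotation augmentation, assuming square RGB chips held as numpy arrays; scipy's `rotate` stands in for whatever rotation routine the original implementation used:

```python
from scipy.ndimage import rotate

def rotate_augment(image, phi_deg=10):
    """Rotate a direction-normalized target chip from 0 to 360 degrees in
    steps of phi_deg, yielding floor(360/phi_deg) oriented copies."""
    n_dirs = int(360 // phi_deg)
    return [rotate(image, angle=g * phi_deg, reshape=False, mode='nearest')
            for g in range(n_dirs)]
```

With phi_deg=10 this yields the 36 orientation classes per target used in the embodiment below.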
Step b1, data preprocessing: convert each of the c classes of training images to grayscale by taking a weighted average of the R, G and B components (the weighted-average method); downsample the grayscale image to size n×n; energy-normalize the n×n image to obtain a normalized image; convert the normalized image into an n²×1 column vector and use it as one column of the training data, giving the preprocessed training data set U=[U1,U2,…,Uc], where Ui is the sub-data set of U corresponding to class i, i=1,2,…,c.
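As a concrete illustration of step b1, a minimal numpy sketch of the preprocessing chain follows; the block-mean downsampling and the helper names are assumptions, since the patent does not prescribe a particular resampling scheme:

```python
import numpy as np

def to_gray(rgb):
    # Weighted-average grayscale conversion used throughout the patent.
    return 0.3 * rgb[..., 0] + 0.59 * rgb[..., 1] + 0.11 * rgb[..., 2]

def preprocess(rgb, n=15):
    gray = to_gray(rgb.astype(np.float64))
    h, w = gray.shape
    # Crude block-mean downsampling to n x n (an assumption, not prescribed).
    small = gray[:h - h % n, :w - w % n].reshape(n, h // n, n, w // n).mean(axis=(1, 3))
    # Energy normalization: divide by the square root of the total energy.
    small /= np.sqrt((small ** 2).sum())
    return small.reshape(-1, 1)  # n^2 x 1 column vector
```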
Step c1, dictionary training: train on the known training data set U=[U1,U2,…,Uc] with the FDDL software package released with Fisher Discrimination Dictionary Learning for Sparse Representation, obtaining the dictionary D=[D1,D2,…,Dc], where Di is the sub-dictionary corresponding to class i.
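For orientation, the FDDL objective minimized by that package has roughly the following form; this is a sketch following the cited Yang et al. ICCV 2011 paper, not anything stated in this patent:

$$J_{(D,X)}=\arg\min_{(D,X)}\Big\{\,r(U,D,X)+\lambda_1\|X\|_1+\lambda_2 f(X)\,\Big\}$$

where r(U,D,X) is a discriminative fidelity term (each class sub-set Ui should be reconstructed well by its own sub-dictionary Di and poorly by the others) and f(X) is a Fisher discrimination term on the coding coefficients that shrinks within-class scatter and grows between-class scatter; λ1 and λ2 are the package parameters whose ranges are given below.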
Step 2, sparse coding: using the trained dictionary D=[D1,D2,…,Dc], sparsely code every sub-image block of the test image and compute its sparse coefficients, as follows:
Step a2, test-image preprocessing: first convert the test image to a test grayscale image with the weighted-average method of step b1; then slide an S×S window over the test grayscale image with step b to obtain sub-image blocks. Downsample each sub-image block to size n×n, energy-normalize it, and convert the result into an n²×1 column vector β; β represents the pixel gray-value information of the sub-image block obtained through the sliding window.
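A sketch of the sliding-window scan; each window would then be pushed through the same preprocessing as the training chips (the generator below is an illustration, not the patent's code):

```python
def sliding_windows(gray, S, b):
    """Yield the top-left corner and pixels of every S x S window taken
    every b pixels over the test grayscale image (step a2)."""
    H, W = gray.shape
    for s in range(0, H - S + 1, b):
        for t in range(0, W - S + 1, b):
            yield (s, t), gray[s:s + S, t:t + S]
```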
Step b2, sparse coding: for each sub-image block, solve the optimization model

$$\hat{x}=\arg\min_{x}\|x\|_1\quad\text{subject to}\quad\|\beta-Dx\|_2\le\varepsilon$$

to obtain the sparse coding coefficients $\hat{x}=[\hat{x}_1;\hat{x}_2;\dots;\hat{x}_c]$ of each sub-image block, where $\hat{x}_i$ is the coefficient vector corresponding to sub-dictionary Di, ε>0 is the allowable error, ||·||1 is the l1 norm, and ||·||2 is the l2 norm.
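The patent does not fix a particular l1 solver for this model; as an illustration, the following numpy sketch solves the equivalent Lagrangian form with ISTA, where the regularization weight lam and the iteration count are assumptions:

```python
import numpy as np

def sparse_code_ista(D, beta, lam=0.1, n_iter=200):
    """Solve min_x 0.5*||beta - D x||_2^2 + lam*||x||_1 by ISTA; for a
    suitable lam this matches the error-constrained model above."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros((D.shape[1], 1))
    for _ in range(n_iter):
        grad = D.T @ (D @ x - beta)
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x
```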
Step c2, reconstruction error: from the sparse coding coefficients, compute the reconstruction error ei of the sub-image block with respect to every class; take e=min{ei} as the reconstruction error of the block and record the corresponding class C=arg mini{ei}. Then compare the reconstruction error e with a preset threshold τ to decide whether the block contains a target: if e<τ, the block contains a target; otherwise the block is background.
Step 3, target detection and recognition:
Step a3: collect the reconstruction errors e of all sub-image blocks judged in step c2 to contain a target into a reconstruction error matrix E=(est)P×Q, of the same size as the test grayscale image, representing the candidate target regions; est is the value of the reconstruction error matrix at coordinate (s,t), s=1,2,…,P, t=1,2,…,Q.
Likewise, collect the class C of each such sub-image block into a class matrix L=(Cst)P×Q, of the same size as the test grayscale image, representing the candidate target classes; Cst is the value of the class matrix at coordinate (s,t).
Step b3: change the size S×S of the sliding window G times and repeat step 2 through step a3 G times, obtaining G reconstruction error matrices and G class matrices; G ranges from 5 to 10. Stack the G reconstruction error matrices into a multi-scale reconstruction error matrix MAP=(estg)P×Q×G, where estg, an element of MAP, is the value est of the reconstruction error matrix obtained at the g-th change of window size; P×Q×G is the size of the multi-scale reconstruction error matrix, g=1,2,…,G.
Stack the G class matrices into a multi-scale class matrix CLASS=(Cstg)P×Q×G, where Cstg, an element of CLASS, is the value Cst of the class matrix obtained at the g-th change of window size. From MAP compute the minimum reconstruction error matrix (map(s,t))P×Q, whose value at coordinate (s,t) is map(s,t)=min over g of {estg}.
Then compute the minimum class matrix (class(s,t))P×Q corresponding to the minimum reconstruction error matrix, whose value at coordinate (s,t), class(s,t), is the class Cstg at the scale index g that attains map(s,t).
From MAP also compute the scale matrix scale=(scale(s,t))P×Q, whose value at coordinate (s,t), scale(s,t), is the scale index attaining the minimum, i.e. scale(s,t)=arg min over g of {estg}.
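Steps a3 to b3 amount to per-pixel minimization over the scale axis; a numpy sketch, assuming MAP and CLASS are stored as P×Q×G arrays with large error values at background positions:

```python
import numpy as np

def fuse_scales(MAP, CLASS):
    """Collapse the P x Q x G volumes of step b3 to per-pixel minima."""
    g_star = MAP.argmin(axis=2)              # scale index attaining the minimum
    rows, cols = np.indices(g_star.shape)
    map_min = MAP[rows, cols, g_star]        # map(s, t)
    class_min = CLASS[rows, cols, g_star]    # class(s, t)
    return map_min, class_min, g_star        # g_star plays the role of scale(s, t)
```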
Step c3: take the local neighborhood minima of the minimum reconstruction error matrix (map(s,t))P×Q as the detected target responses; the coordinates of a local minimum in (map(s,t))P×Q give the center position of a target, and the values at the corresponding positions of (class(s,t))P×Q and (scale(s,t))P×Q give the target's class and scale.
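A sketch of the local-minimum extraction of step c3, using scipy's minimum_filter; the neighborhood size is an assumption, as the patent does not fix it:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def detect_centers(map_min, tau=0.3, neighborhood=15):
    """A pixel is reported as a target center if it equals the minimum of
    its local neighborhood and its error beats the threshold tau."""
    is_min = map_min == minimum_filter(map_min, size=neighborhood)
    ys, xs = np.nonzero(is_min & (map_min < tau))
    return list(zip(ys, xs))
```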
The weighted-average method is computed as f(x,y)=0.3R(x,y)+0.59G(x,y)+0.11B(x,y), where f(x,y) is the gray value of the resulting grayscale image at pixel (x,y), and R(x,y), G(x,y) and B(x,y) are the R, G and B components of the input training image at pixel (x,y).
The energy normalization is computed as

$$f_{\text{norm}}(x,y)=\frac{f(x,y)}{\sqrt{\sum_{x=1}^{u}\sum_{y=1}^{v}f(x,y)^2}}$$

where fnorm(x,y) is the energy-normalized gray value of f(x,y), and u and v are the numbers of rows and columns of the grayscale image.
The l1 norm is computed as

$$\|z\|_1=\sum_{k=1}^{M}|\xi_k|$$

where z is a vector of size M×1 and ξk, k=1,2,…,M, are its elements.
The l2 norm is computed as

$$\|z\|_2=\Big(\sum_{k=1}^{M}\xi_k^2\Big)^{1/2}$$

where z is a vector of size M×1 and ξk, k=1,2,…,M, are its elements.
The reconstruction error ei is computed as

$$e_i=\|\beta-D_i\hat{x}_i\|_2^2+\gamma\,\|\hat{x}-m_i\|_2^2$$

where γ is a preset weight with range 0 to 1, and mi is the mean vector obtained by averaging the elements of each row of Yi, with Yi the optimal coding coefficients of Ui sparsely coded over the dictionary D.
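Putting step c2 and this formula together, a hypothetical helper follows; the names D_list, x_parts and m_list are illustrative, not from the patent:

```python
import numpy as np

def classify_block(D_list, x_parts, x_hat, beta, m_list, gamma=0.5, tau=0.3):
    """D_list[i] is sub-dictionary D_i, x_parts[i] the slice of the sparse
    code x_hat that belongs to D_i, m_list[i] the class-i mean vector m_i."""
    errs = [float(np.linalg.norm(beta - D_list[i] @ x_parts[i]) ** 2
                  + gamma * np.linalg.norm(x_hat - m_list[i]) ** 2)
            for i in range(len(D_list))]
    e = min(errs)
    C = int(np.argmin(errs))
    return (e, C) if e < tau else (e, None)  # None marks a background block
```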
The rotation step φ ranges from 0° to 90°.
The FDDL package parameter λ1 ranges from 0.001 to 0.01, and λ2 from 0.01 to 0.1.
S is an integer between 40 and 90, and b is an integer between 1 and 15.
The threshold τ ranges from 0 to 1.
Beneficial Effects
In the proposed method for detecting and recognizing multiple classes of targets in remote sensing images based on sparse-representation dictionary learning, a redundant dictionary is first trained on the preprocessed training data; each sub-image block of the test image is then sparsely coded over the trained dictionary, its sparse representation coefficients are computed, and from them the reconstruction error of the block is obtained and thresholded to determine the candidate target regions; finally, post-processing yields accurate detection and recognition of multiple classes of targets in the remote sensing image.
The invention automatically detects and recognizes targets of multiple classes in remote sensing images with complex backgrounds. Experiments show that the method attains high detection and recognition accuracy with a low false-alarm rate.
Brief Description of the Drawings
Figure 1: basic flowchart of the method of the invention
Figure 2: training data used in the method of the invention
Figure 3: sample detection results of the method of the invention
(a) aircraft detection results (red boxes mark aircraft targets, yellow boxes false alarms)
(b) ship detection results (white boxes mark ship targets)
(c) oil-depot detection results (blue boxes mark oil-depot targets)
(d) aircraft and ship detection results
(e) aircraft and oil-depot detection results
(f) ship and oil-depot detection results
Detailed Description of the Embodiments
The invention is further described below with reference to an implementation example and the drawings.
The hardware environment is a computer with an Intel Pentium 2.93 GHz CPU and 2.0 GB of memory; the software environment is Matlab R2011a on Windows XP. One hundred remote sensing images obtained from Google Earth were selected for the multi-class detection experiments, covering three target classes (aircraft, ships and oil depots), with 200 aircraft targets, 120 ship targets and 420 oil-depot targets in total.
The invention is implemented as follows:
1. Train the redundant dictionary with the sparse-representation dictionary-learning method, as follows:
(1.1) Training-image preparation: first align all targets of the same class in the original images to a common principal direction, then rotate each direction-normalized image from 0° to 360° in 10° steps, giving 36 classes of training data per target; processing the original images of all target classes in this way finally yields 55 classes of training images, i.e. c=55, comprising 36 aircraft classes, 18 ship classes and 1 oil-depot class.
(1.2) Data preprocessing: convert the 55 classes of training images to grayscale by the weighted-average method over the R, G and B components, downsample each grayscale image to 15×15, energy-normalize the 15×15 image, convert the normalized image into a 225×1 column vector and use it as one column of the training data, giving the preprocessed training data set U=[U1,U2,…,Uc], where Ui is the sub-data set of U corresponding to class i, i=1,2,…,c.
(1.3) Train on the known training data set U=[U1,U2,…,Uc] with the FDDL software package released by Lei Zhang, obtaining the dictionary D=[D1,D2,…,Dc], where Di is the sub-dictionary corresponding to class i; the package parameters are λ1=0.005 and λ2=0.05.
Lei Zhang's FDDL package is described in: Meng Yang, Lei Zhang, Xiangchu Feng, David Zhang. Fisher Discrimination Dictionary Learning for Sparse Representation [C]. ICCV, 2011.
2. Sparse coding: using the trained dictionary D=[D1,D2,…,Dc], sparsely code every sub-image block of the test image and compute its sparse coefficients, as follows:
(2.1) Test-image preprocessing: first convert the test image to a test grayscale image with the weighted-average method of (1.2); then slide an S×S window over the test grayscale image with a step of 5 pixels, the initial value of S being 90, to obtain sub-image blocks. Downsample each sub-image block to 15×15, energy-normalize it, and convert the result into a 225×1 column vector β; β represents the pixel gray-value information of the sub-image block obtained through the sliding window.
(2.2) Sparse coding: for each sub-image block, solve the optimization model

$$\hat{x}=\arg\min_{x}\|x\|_1\quad\text{subject to}\quad\|\beta-Dx\|_2\le\varepsilon$$

to obtain the sparse coding coefficient vector $\hat{x}=[\hat{x}_1;\hat{x}_2;\dots;\hat{x}_c]$ of each sub-image block, where $\hat{x}_i$ is the coefficient vector corresponding to sub-dictionary Di; the allowable error is ε=0.15, ||·||1 is the l1 norm, and ||·||2 is the l2 norm.
(2.3) Reconstruction error: from the sparse coding coefficients, compute the reconstruction error ei of the sub-image block with respect to every class, with weight γ=0.5; take e=min{ei} as the reconstruction error of the block and record the corresponding class C=arg mini{ei}. Then compare e with the preset threshold τ=0.3 to decide whether the block contains a target: if e<τ, the block contains a target; otherwise the block is background.
3. Target detection and recognition:
(3.1) Collect the reconstruction errors e of all sub-image blocks judged in (2.3) to contain a target into a reconstruction error matrix E=(est)P×Q, of the same size as the test grayscale image, representing the candidate target regions; est is the value of the reconstruction error matrix at coordinate (s,t), s=1,2,…,P, t=1,2,…,Q. Likewise collect the class C of each such block into a class matrix L=(Cst)P×Q.
(3.2) Change the size S×S of the sliding window, setting S=90-10×j, j=1,2,…,G, where G is the number of changes; repeat step 2 and step (3.1) G times, obtaining G reconstruction error matrices and G class matrices. Stack the G reconstruction error matrices into a multi-scale reconstruction error matrix MAP=(estg)P×Q×G, where estg, an element of MAP, is the value est of the reconstruction error matrix obtained at the g-th change of window size, P×Q×G is the size of the multi-scale matrix, g=1,2,…,G. Stack the G class matrices into a multi-scale class matrix CLASS=(Cstg)P×Q×G, where Cstg, an element of CLASS, is the value Cst of the class matrix obtained at the g-th change of window size. From MAP compute the minimum reconstruction error matrix (map(s,t))P×Q with map(s,t)=min over g of {estg}; then compute the corresponding minimum class matrix (class(s,t))P×Q, where class(s,t) is the class Cstg at the scale index g that attains map(s,t); and from MAP compute the scale matrix scale=(scale(s,t))P×Q with scale(s,t)=arg min over g of {estg}.
(3.3) Take the local neighborhood minima of the minimum reconstruction error matrix (map(s,t))P×Q as the detected target responses; the coordinates of a local minimum in (map(s,t))P×Q give the center position of a target, from which the target's class and scale are read off at the corresponding positions of (class(s,t))P×Q and (scale(s,t))P×Q.
The weighted-average method is computed as

f(x,y)=0.3R(x,y)+0.59G(x,y)+0.11B(x,y)

where f(x,y) is the gray value of the resulting grayscale image at pixel (x,y), and R(x,y), G(x,y) and B(x,y) are the R, G and B components of the input image at pixel (x,y).
The energy normalization is computed as

$$f_{\text{norm}}(x,y)=\frac{f(x,y)}{\sqrt{\sum_{x=1}^{u}\sum_{y=1}^{v}f(x,y)^2}}$$

where fnorm(x,y) is the energy-normalized gray value of f(x,y), and u and v are the numbers of rows and columns of the grayscale image, u=15, v=15.
The l1 norm is computed as

$$\|z\|_1=\sum_{k=1}^{M}|\xi_k|$$

where z is a vector of size M×1 and ξk, k=1,2,…,M, are its elements.
The l2 norm is computed as

$$\|z\|_2=\Big(\sum_{k=1}^{M}\xi_k^2\Big)^{1/2}$$

where z is a vector of size M×1 and ξk, k=1,2,…,M, are its elements.
The reconstruction error ei is computed as

$$e_i=\|\beta-D_i\hat{x}_i\|_2^2+\gamma\,\|\hat{x}-m_i\|_2^2$$

where γ is the preset weight, γ=0.5, and mi is the mean vector obtained by averaging the elements of each row of Yi, with Yi the optimal coding coefficients of Ui sparsely coded over the dictionary D.
The effectiveness of the invention is evaluated by the correct detection rate and the false-alarm rate. The correct detection rate is defined as the ratio of the number of correctly detected targets to the total number of targets; the false-alarm rate is defined as the ratio of the number of false alarms to the sum of the number of correctly detected targets and the number of false alarms. The detection results of the invention were also compared with a BoW-based multi-class target detection algorithm; the comparison is shown in Table 1. Both the correct detection rate and the false-alarm rate demonstrate the effectiveness of the method.
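The two measures, written out (a trivial sketch of the definitions above):

```python
def correct_detection_rate(n_correct, n_total):
    """Correctly detected targets over all targets."""
    return n_correct / n_total

def false_alarm_rate(n_false, n_correct):
    """False alarms over (correct detections + false alarms)."""
    return n_false / (n_correct + n_false)
```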
Table 1. Evaluation of the detection results
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210300645.8A CN102867195B (en) | 2012-08-22 | 2012-08-22 | Method for detecting and identifying a plurality of types of objects in remote sensing image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210300645.8A CN102867195B (en) | 2012-08-22 | 2012-08-22 | Method for detecting and identifying a plurality of types of objects in remote sensing image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102867195A CN102867195A (en) | 2013-01-09 |
CN102867195B true CN102867195B (en) | 2014-11-26 |
Family
ID=47446059
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210300645.8A Active CN102867195B (en) | 2012-08-22 | 2012-08-22 | Method for detecting and identifying a plurality of types of objects in remote sensing image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102867195B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103258210B (en) * | 2013-05-27 | 2016-09-14 | 中山大学 | A kind of high-definition image classification method based on dictionary learning |
CN103632164B (en) * | 2013-11-25 | 2017-03-01 | 西北工业大学 | The volume firm state classification recognition methodss of the KNN coil image data based on KAP sample optimization |
CN104517121A (en) * | 2014-12-10 | 2015-04-15 | 中国科学院遥感与数字地球研究所 | Spatial big data dictionary learning method based on particle swarm optimization |
CN105740422B (en) * | 2016-01-29 | 2019-10-29 | 北京大学 | Pedestrian retrieval method and device |
CN106067041B (en) * | 2016-06-03 | 2019-05-31 | 河海大学 | A kind of improved multi-target detection method based on rarefaction representation |
CN107451595A (en) * | 2017-08-04 | 2017-12-08 | 河海大学 | Infrared image salient region detection method based on hybrid algorithm |
CN109190457B (en) * | 2018-07-19 | 2021-12-03 | 北京市遥感信息研究所 | Oil depot cluster target rapid detection method based on large-format remote sensing image |
CN109946076B (en) * | 2019-01-25 | 2020-04-28 | 西安交通大学 | A weighted multi-scale dictionary learning framework for planetary bearing fault identification |
CN110189328B (en) * | 2019-06-11 | 2021-02-23 | 北华航天工业学院 | Satellite remote sensing image processing system and processing method thereof |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102129573A (en) * | 2011-03-10 | 2011-07-20 | 西安电子科技大学 | SAR (Synthetic Aperture Radar) image segmentation method based on dictionary learning and sparse representation |
CN102324047A (en) * | 2011-09-05 | 2012-01-18 | 西安电子科技大学 | Hyperspectral Image Object Recognition Method Based on Sparse Kernel Coding SKR |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8374442B2 (en) * | 2008-11-19 | 2013-02-12 | Nec Laboratories America, Inc. | Linear spatial pyramid matching using sparse coding |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102129573A (en) * | 2011-03-10 | 2011-07-20 | 西安电子科技大学 | SAR (Synthetic Aperture Radar) image segmentation method based on dictionary learning and sparse representation |
CN102324047A (en) * | 2011-09-05 | 2012-01-18 | 西安电子科技大学 | Hyperspectral Image Object Recognition Method Based on Sparse Kernel Coding SKR |
Non-Patent Citations (1)
Title |
---|
Liang Tianyi et al. An image semantic classifier model based on sparse coding. Journal of East China University of Science and Technology (Natural Science Edition), 2007, vol. 33, no. 6, pp. 827-892. *
Also Published As
Publication number | Publication date |
---|---|
CN102867195A (en) | 2013-01-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102867195B (en) | Method for detecting and identifying a plurality of types of objects in remote sensing image | |
CN107346436B (en) | Visual saliency detection method fusing image classification | |
CN103310195B (en) | Based on LLC feature the Weakly supervised recognition methods of vehicle high score remote sensing images | |
CN105389550B (en) | It is a kind of based on sparse guide and the remote sensing target detection method that significantly drives | |
CN102722712B (en) | Multiple-scale high-resolution image object detection method based on continuity | |
CN101980250B (en) | Method for identifying target based on dimension reduction local feature descriptor and hidden conditional random field | |
CN106845341B (en) | Unlicensed vehicle identification method based on virtual number plate | |
CN106650731B (en) | A Robust License Plate and Vehicle Logo Recognition Method | |
CN105528595A (en) | Method for identifying and positioning power transmission line insulators in unmanned aerial vehicle aerial images | |
CN104657717B (en) | A kind of pedestrian detection method based on layering nuclear sparse expression | |
CN105956560A (en) | Vehicle model identification method based on pooling multi-scale depth convolution characteristics | |
Beyan et al. | Detecting abnormal fish trajectories using clustered and labeled data | |
CN110610165A (en) | A Ship Behavior Analysis Method Based on YOLO Model | |
CN104778457A (en) | Video face identification algorithm on basis of multi-instance learning | |
Yao et al. | R²IPoints: Pursuing Rotation-Insensitive Point Representation for Aerial Object Detection | |
CN104751475B (en) | A kind of characteristic point Optimum Matching method towards still image Object identifying | |
CN105224937A (en) | Based on the semantic color pedestrian of the fine granularity heavily recognition methods of human part position constraint | |
Su et al. | FSRDD: An efficient few-shot detector for rare city road damage detection | |
CN105574489A (en) | Layered stack based violent group behavior detection method | |
CN109002463A (en) | A kind of Method for text detection based on depth measure model | |
CN103617413A (en) | Method for identifying object in image | |
CN104732248A (en) | Human body target detection method based on Omega shape features | |
CN112784722A (en) | Behavior identification method based on YOLOv3 and bag-of-words model | |
CN103218823B (en) | Based on the method for detecting change of remote sensing image that core is propagated | |
CN102609715B (en) | Object type identification method combining plurality of interest point testers |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |