CN106909924A - A fast remote sensing image retrieval method based on deep saliency - Google Patents
A fast remote sensing image retrieval method based on deep saliency
- Publication number
- CN106909924A (application number CN201710087670.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- network
- task
- saliency
- parameters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/05—Underwater scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
Description
Technical Field
The present invention takes remote sensing imagery as its research object and applies deep learning, a recent advance in the field of artificial intelligence, to develop a fast retrieval method for remote sensing images. First, a fully convolutional neural network is used to build a multi-task salient object detection model that computes the deep saliency features of remote sensing images. The deep network structure is then extended with a hash layer that learns binary hash codes. Finally, the saliency features and hash codes are used together to achieve accurate and fast retrieval of remote sensing images. The invention belongs to the field of computer vision and specifically involves deep learning, salient object detection, image retrieval, and related technologies.
Background Art
As the foundational data underlying the three major spatial information technologies, namely geographic information systems (GIS), the global positioning system (GPS), and remote sensing (RS), remote sensing image data is widely used in environmental monitoring, resource surveying, land use, urban planning, natural disaster analysis, the military, and other fields. In recent years, with the development of high-resolution remote sensing satellites, imaging radar, and unmanned aerial vehicle (UAV) technology, remote sensing image data has become increasingly massive, complex, and high-resolution. Efficient and accurate retrieval of remote sensing images therefore has important research significance and application value for promoting accurate extraction and sharing of remote sensing image information.
Image retrieval technology has evolved from early text-based image retrieval (TBIR) to content-based image retrieval (CBIR), which works by extracting image features. Retrieval based on salient objects can quickly select a few salient regions of a complex scene for priority processing, effectively reducing data-processing complexity and improving retrieval efficiency. Compared with ordinary images, remote sensing images contain complex and variable information, and their targets are small and poorly distinguished from the background, so traditional saliency detection methods struggle to describe and analyze the salient features of remote sensing images accurately. In recent years, deep learning has emerged as a major advance in artificial intelligence. Deep neural networks, exemplified by the fully convolutional neural network (FCNN), exhibit excellent robustness in learning the deep saliency features of images, thanks to convolution kernels that resemble the local receptive fields of the human eye and a hierarchical cascade structure that resembles biological neural systems. Weight sharing also greatly reduces the number of network parameters and lowers the risk of overfitting the training data, making such networks easier to train than other kinds of deep networks and improving the representational accuracy of saliency features.
Considering the rapidly growing volume of remote sensing imagery and the limited semantic description capability of images, the present invention uses the public large-scale Aerial Image Dataset (AID), the Wuhan University remote sensing image dataset (WHU-RS), and Google Earth remote sensing images as data sources, and proposes a fast remote sensing image retrieval method based on deep saliency. First, a multi-task salient object detection model based on a fully convolutional neural network (FCNN) is constructed; semantic information of remote sensing images at different levels is learned on the pre-training dataset as deep saliency features and converted into one-dimensional column vectors. The network is then fine-tuned: a hash layer is introduced and training samples are added, and the high-dimensional saliency features learned by the model are mapped to a low-dimensional space in the form of binary hash codes. The saliency feature vectors and hash codes are stored separately to build a feature database. At query time, the trained model extracts the saliency feature vector and hash code of the remote sensing image to be retrieved; these are compared against the feature database, and similarity is measured by the Hamming distance between hash codes and the Euclidean distance between saliency feature vectors, achieving fast retrieval of remote sensing images.
Summary of the Invention
Unlike existing remote sensing image retrieval methods, the present invention uses deep learning to propose a fast remote sensing image retrieval method based on deep saliency. First, a fully convolutional neural network (FCNN) is used to build a multi-task deep salient object detection model, extending the image-level classification of an ordinary convolutional neural network (CNN) to pixel-level classification. The network is pre-trained on the large-scale Aerial Image Dataset (AID); the saliency detection task and the semantic segmentation task share convolutional layers and jointly learn three levels of semantic information from remote sensing images, effectively removing feature redundancy and accurately extracting deep saliency features. Second, a hash layer is added to the model and the Wuhan University remote sensing image dataset (WHU-RS) is expanded for fine-tuning the network. Exploiting the deep network's capacity for incremental learning through stochastic gradient descent (SGD), binary hash codes are learned point by point, reducing the dimensionality of the high-dimensional saliency features, which both saves storage space and improves retrieval efficiency. Moreover, compared with traditional hashing methods that require training samples to be input in pairs, the method adopted here scales more easily to large datasets. The saliency features learned during pre-training and fine-tuning are converted into one-dimensional column vectors and, together with the binary hash codes, form the feature database. Finally, the image retrieval stage adopts a coarse-to-fine strategy, jointly using the binary hash codes and saliency features to measure Hamming distance and Euclidean distance, achieving fast and accurate retrieval of remote sensing images. The main procedure, shown in Figure 1, comprises three steps: construction of the object detection model based on deep saliency; network pre-training followed by fine-tuning with an added hash layer; and multi-level deep retrieval.
(1) Construction of the object detection model based on deep saliency
To extract the salient regions of an image effectively, the present invention constructs a multi-task salient object detection model based on a fully convolutional neural network. The model performs two tasks simultaneously: saliency detection and semantic segmentation. Saliency detection learns deep features of remote sensing images and computes deep saliency; semantic segmentation extracts the semantic information of objects within the image, eliminating background confusion in the saliency map and filling in missing parts of salient objects.
(2) Network pre-training and fine-tuning with an added hash layer
The present invention selects the large-scale Aerial Image Dataset (AID) as the standard dataset for pre-training the network. To make the saliency features learned by the salient object detection model more robust for retrieving Chinese remote sensing imagery, 6050 Chinese remote sensing images of varying illumination, viewing angle, resolution, and size were downloaded from Google Earth, expanding the Wuhan University remote sensing image dataset (WHU-RS) to 7000 images used for fine-tuning the network.
(3) Multi-level deep retrieval
The present invention proposes a coarse-to-fine retrieval scheme. Coarse retrieval uses the binary hash codes learned by the hash layer and measures similarity by Hamming distance. Fine retrieval maps the two-dimensional feature maps produced by the 13th and 15th convolutional layers into one-dimensional column vectors, used as saliency feature vectors, and measures similarity by Euclidean distance. Retrieval results are evaluated with a ranking-based criterion by computing precision.
1. A fast remote sensing image retrieval method based on deep saliency, characterized by comprising the following steps:
Step 1: Construction of the object detection model based on deep saliency
An RGB image is input and passed through a series of convolution operations in 15 convolutional layers; the saliency detection task and the superpixel-level semantic segmentation task then share these convolutional layers. The first 13 convolutional layers are initialized from the VGGNet convolutional neural network with 3×3 kernels, and each convolutional layer is followed by a rectified linear unit (ReLU) activation function. Max pooling is performed after convolutional layers 2, 4, 5, and 13. Convolutional layers 14 and 15 have kernel sizes of 7×7 and 1×1 respectively, and each is followed by a Dropout layer.
A deconvolution layer is constructed for upsampling; its parameters are initialized by bilinear interpolation and updated iteratively as the upsampling function is learned during training. In the salient object detection task, a sigmoid threshold function normalizes the output image to [0, 1] and saliency features are learned. In the semantic segmentation task, the deconvolution layer upsamples the feature map of the last convolutional layer, and the upsampled result is cropped so that the output image has the same size as the input image.
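The bilinear-interpolation initialization of the deconvolution layer described above can be sketched as follows. This is an illustrative NumPy sketch, not part of the patent text; the function name `bilinear_kernel` is a hypothetical name for the standard FCN-style initializer.

```python
import numpy as np

def bilinear_kernel(size: int) -> np.ndarray:
    """Return a size x size bilinear upsampling kernel, the standard
    initialization for the weights of an FCN deconvolution layer."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

# A 4x4 kernel performs 2x upsampling; each row/column carries the
# triangular interpolation weights (0.25, 0.75, 0.75, 0.25).
k = bilinear_kernel(4)
```

During training these weights are then refined by backpropagation rather than kept fixed, matching the "updated iteratively" behavior described above.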
Step 2: Network pre-training and fine-tuning with an added hash layer
Step 2.1: Pre-training the multi-task salient object detection model
FCNN pre-training is carried out jointly on the saliency detection task and the segmentation task. Let χ denote a set of N1 training images of width W and height Q, with X_i the i-th image and Y_ijk the pixel-level ground-truth segmentation label at pixel (j, k) of the i-th image, where i = 1…N1, j = 1…W, k = 1…Q. Let Z denote a set of N2 training images, with Z_n the n-th image, n = 1…N2, each having a corresponding ground-truth binary map M_n marking the salient object. Let θ_s be the shared convolutional-layer parameters, θ_h the segmentation-task parameters, and θ_f the saliency-task parameters. Formula (1) is the cross-entropy cost function J_1(χ; θ_s, θ_h) of the segmentation task and formula (2) is the squared-Euclidean-distance cost function J_2(Z; θ_s, θ_f) of the saliency detection task; the FCNN is trained by minimizing both cost functions:

J_1(χ; θ_s, θ_h) = −Σ_{i=1…N1} Σ_{j=1…W} Σ_{k=1…Q} Σ_{c=1…C} 1{Y_ijk = c} log h_cjk(X_i; θ_s, θ_h)    (1)

J_2(Z; θ_s, θ_f) = Σ_{n=1…N2} ‖f(Z_n; θ_s, θ_f) − M_n‖_F²    (2)
In formula (1), 1{·} is the indicator function, h_cjk is element (j, k) of the class-c confidence segmentation map, c = 1…C, h(X_i; θ_s, θ_h) is the semantic segmentation function, which returns confidence segmentation maps for the C target classes, and C is the number of image categories in the pre-training dataset. In formula (2), f(Z_n; θ_s, θ_f) is the saliency-map output function and ‖·‖_F denotes the Frobenius norm.
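The two cost functions can be sketched numerically as follows. This is an illustrative NumPy sketch with toy shapes, not part of the patent text; `cross_entropy_cost` and `saliency_cost` are hypothetical names.

```python
import numpy as np

def cross_entropy_cost(conf_maps, labels):
    """Formula (1): pixel-wise cross-entropy for the segmentation task.
    conf_maps: (C, W, Q) class-confidence maps (softmax outputs);
    labels:    (W, Q) integer ground-truth class per pixel."""
    C, W, Q = conf_maps.shape
    j, k = np.meshgrid(range(W), range(Q), indexing="ij")
    # The indicator 1{Y_jk = c} selects the confidence of the true class.
    return -np.sum(np.log(conf_maps[labels, j, k]))

def saliency_cost(pred, truth):
    """Formula (2): squared Frobenius-norm distance between the predicted
    saliency map and the binary ground-truth map."""
    return np.sum((pred - truth) ** 2)

conf = np.full((3, 2, 2), 1 / 3)         # uniform confidence over 3 classes
labels = np.zeros((2, 2), dtype=int)     # every pixel belongs to class 0
seg_loss = cross_entropy_cost(conf, labels)                   # 4 * ln(3)
sal_loss = saliency_cost(np.ones((2, 2)), np.zeros((2, 2)))   # 4.0
```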
Next, the stochastic gradient descent (SGD) method is used to minimize the above cost functions on the basis of regularization over all training samples. Because the pre-training dataset does not carry both segmentation and saliency annotations simultaneously, the segmentation task and the saliency detection task are trained alternately. All original images are normalized to the same size before training. The learning rate is 0.001±0.01, the momentum parameter is typically in [0.9, 1.0], and the weight-decay factor is typically 0.0005±0.0002. The SGD learning process runs for more than 80000 iterations in total. The detailed pre-training procedure is as follows:
1) The shared convolutional parameters θ_s⁽⁰⁾ are initialized from VGGNet;
2) The segmentation-task parameters θ_h⁽⁰⁾ and the saliency-task parameters θ_f⁽⁰⁾ are randomly initialized from a normal distribution;
3) With θ_s⁽⁰⁾ and θ_h⁽⁰⁾, the segmentation network is trained by SGD, updating these two parameters to θ_s⁽¹⁾ and θ_h⁽¹⁾;
4) With θ_s⁽¹⁾ and θ_f⁽⁰⁾, the saliency network is trained by SGD, updating the relevant parameters to θ_s⁽²⁾ and θ_f⁽¹⁾;
5) With θ_s⁽²⁾ and θ_h⁽¹⁾, the segmentation network is trained by SGD, obtaining θ_s⁽³⁾ and θ_h⁽²⁾;
6) With θ_s⁽³⁾ and θ_f⁽¹⁾, the saliency network is trained by SGD, updating the relevant parameters to θ_s⁽⁴⁾ and θ_f⁽²⁾;
7) Steps 3-6 are repeated three times to obtain the final pre-trained parameters θ_s, θ_h, θ_f;
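The alternating schedule above can be sketched as a plain training loop. This is a toy illustration, not the patent's actual training code: the quadratic loss, the two-dimensional parameter vectors, and the name `sgd_step` are all stand-ins for the real segmentation and saliency objectives.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_s = rng.normal(size=2)   # shared convolutional parameters (step 1)
theta_h = rng.normal(size=2)   # segmentation-task parameters (step 2)
theta_f = rng.normal(size=2)   # saliency-task parameters (step 2)
lr = 0.1

def sgd_step(shared, head, target):
    """One SGD step on a toy quadratic loss ||shared + head - target||^2;
    both the shared parameters and the task head receive the gradient."""
    grad = 2.0 * (shared + head - target)
    return shared - lr * grad, head - lr * grad

for _ in range(3):  # step 7: repeat the alternating rounds
    # steps 3 / 5: segmentation task updates theta_s and theta_h
    theta_s, theta_h = sgd_step(theta_s, theta_h, np.ones(2))
    # steps 4 / 6: saliency task updates theta_s and theta_f
    theta_s, theta_f = sgd_step(theta_s, theta_f, np.zeros(2))
```

The point of the sketch is the control flow: the shared parameters are touched by every step, while each task head is only updated by its own task.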
Step 2.2: Adding the hash layer and fine-tuning the network for the target domain
Between the penultimate layer of the pre-trained network and the final task layer, a fully connected layer containing s neurons, the hash layer H, is inserted; it maps the high-dimensional features to a low-dimensional space and generates binary hash codes for storage. The weights of the hash layer H are initialized with hash values constructed by random projection; the neuron activation function is the sigmoid, so output values lie between 0 and 1; and the number of neurons equals the code length of the target binary code.
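The hash layer can be sketched as a random-projection-initialized fully connected layer with sigmoid activations. This is an illustrative NumPy sketch, not part of the patent text; the dimensions and variable names (`W_H`, `feature`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(42)
d, s = 4096, 48                # feature dimension; code length s in [40, 100]
W_H = rng.normal(size=(d, s)) / np.sqrt(d)   # random-projection initialization

feature = rng.normal(size=d)                 # penultimate-layer activation
hash_activation = sigmoid(feature @ W_H)     # s values, each in (0, 1)
```

During fine-tuning these weights would be trained by backpropagation along with the rest of the network; the sigmoid keeps every activation in (0, 1) so that it can later be thresholded into a binary bit.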
The fine-tuning process adjusts the network weights by the backpropagation algorithm; fine-tuning adjusts the network weights after the tenth convolutional layer. The dataset used for fine-tuning is 10%-50% smaller than the pre-training dataset; compared with the pre-training settings, the number of iterations and the learning rate are reduced by 1%-10% during fine-tuning, while the momentum parameter and weight-decay factor remain unchanged.
The detailed fine-tuning procedure is as follows:
1) The shared convolutional parameters θ_s⁽⁰⁾, segmentation-task parameters θ_h⁽⁰⁾, and saliency-task parameters θ_f⁽⁰⁾ are taken from the pre-training process;
2) With θ_s⁽⁰⁾ and θ_h⁽⁰⁾, the segmentation network is trained by SGD, updating these two parameters to θ_s⁽¹⁾ and θ_h⁽¹⁾;
3) With θ_s⁽¹⁾ and θ_f⁽⁰⁾, the saliency network is trained by SGD, updating the relevant parameters to θ_s⁽²⁾ and θ_f⁽¹⁾;
4) With θ_s⁽²⁾ and θ_h⁽¹⁾, the segmentation network is trained by SGD, obtaining θ_s⁽³⁾ and θ_h⁽²⁾;
5) With θ_s⁽³⁾ and θ_f⁽¹⁾, the saliency network is trained by SGD, updating the relevant parameters to θ_s⁽⁴⁾ and θ_f⁽²⁾;
6) Steps 2-5 are repeated three times to obtain the final parameters θ_s, θ_h, θ_f;
Step 3: Multi-level deep retrieval
Step 3.1: Coarse retrieval
Step 3.1.1: Generating the binary hash code
An image I_q to be queried is input into the fine-tuned network, and the output of the hash layer is extracted as the image signature, denoted Out(H). The binary code is obtained by binarizing the activation values against a threshold (0.5 is the natural choice for sigmoid outputs); for each bit r = 1…s, the binary code is output according to formula (3):

H_r = 1 if Out_r(H) ≥ 0.5, and H_r = 0 otherwise    (3)
where s is the number of hash-layer neurons, with an initial value chosen in the range [40, 100]. Γ = {I_1, I_2, …, I_n} denotes the retrieval dataset of n images, and the corresponding binary codes are Γ_H = {H_1, H_2, …, H_n}, where for i = 1…n, H_i ∈ {0, 1}^s is the s-bit binary code generated by the s neurons, each bit taking the value 0 or 1;
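The thresholding in formula (3) amounts to a single comparison per bit. A minimal sketch, assuming sigmoid activations in (0, 1); the name `binarize` is illustrative.

```python
import numpy as np

def binarize(out_h, threshold=0.5):
    """Formula (3): H_r = 1 if Out_r(H) >= threshold, else 0."""
    return (np.asarray(out_h) >= threshold).astype(int)

# Five sigmoid activations from the hash layer -> a 5-bit code.
code = binarize([0.91, 0.12, 0.50, 0.33, 0.77])   # [1, 0, 1, 0, 1]
```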
Step 3.1.2: Measuring similarity by Hamming distance
The Hamming distance between two strings of equal length is the number of positions at which the corresponding characters differ. For a query image I_q with binary code H_q, if the Hamming distance between H_q and some H_i ∈ Γ_H is below a set threshold, the image is placed in a candidate pool P = {I_c1, I_c2, …, I_cm} of m candidate images; a Hamming distance of less than 5 is taken to mean that two images are similar;
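Coarse retrieval then reduces to a Hamming-distance filter over the stored codes. A minimal pure-Python sketch, not the patent's implementation; `hamming` and `candidate_pool` are illustrative names.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def candidate_pool(query_code, database_codes, threshold=5):
    """Coarse retrieval: keep every database image whose code lies within
    the Hamming-distance threshold of the query code."""
    return [i for i, code in enumerate(database_codes)
            if hamming(query_code, code) < threshold]

q = [1, 0, 1, 1, 0, 0, 1, 0]
db = [[1, 0, 1, 1, 0, 0, 1, 1],   # distance 1 -> kept as candidate
      [0, 1, 0, 0, 1, 1, 0, 1],   # distance 8 -> rejected
      [1, 0, 1, 1, 0, 0, 1, 0]]   # distance 0 -> kept as candidate
pool = candidate_pool(q, db)      # [0, 2]
```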
Step 3.2: Fine retrieval
Step 3.2.1: Saliency feature extraction
The two-dimensional feature maps produced for the query image I_q by the 13th and 15th convolutional layers of the network are each mapped to a one-dimensional vector and stored. In the subsequent retrieval process, the retrieval results obtained with the different feature vectors are compared in order to decide which layer's feature map is ultimately used to extract the saliency features of remote sensing images;
Step 3.2.2: Measuring similarity by Euclidean distance
For a query image I_q and a candidate pool P, the extracted saliency feature vectors are used to select the top-k images from P. Let V_q and V_ci denote the feature vectors of the query image I_q and of candidate I_ci respectively. The Euclidean distance s_i between I_q and the feature vector of the i-th image in the candidate pool is defined as their similarity level, as shown in formula (4):

s_i = ‖V_q − V_ci‖_2    (4)
The smaller the Euclidean distance, the greater the similarity between the two images. The candidate images I_ci are sorted in ascending order of distance to the query image, and the top-k images are returned as the retrieval result;
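The fine-ranking step can be sketched as follows. This is an illustrative NumPy sketch; `fine_rank` is a hypothetical name and the two-dimensional toy vectors stand in for the real high-dimensional saliency features.

```python
import numpy as np

def fine_rank(v_q, candidates, k):
    """Formula (4): rank candidate feature vectors by Euclidean distance to
    the query vector and return the indices of the top-k matches."""
    dists = [np.linalg.norm(v_q - np.asarray(v)) for v in candidates]
    return sorted(range(len(candidates)), key=lambda i: dists[i])[:k]

v_q = np.array([0.0, 0.0])
cand = [np.array([3.0, 4.0]),   # distance 5.0
        np.array([1.0, 0.0]),   # distance 1.0
        np.array([0.0, 2.0])]   # distance 2.0
top2 = fine_rank(v_q, cand, 2)  # [1, 2]
```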
Step 3.3: Evaluation of retrieval results
Retrieval results are evaluated with a ranking-based criterion. For a query image q and the top-k retrieved images, the precision is computed according to formula (5):

Precision@k = (Σ_{i=1…k} Rel(i)) / k    (5)
where Precision@k denotes the precision over the first k returned results for the chosen threshold k, and Rel(i) ∈ {0, 1} denotes the relevance of the query image q to the image ranked i-th: 1 means that q and the i-th image belong to the same class (they are relevant), and 0 means they are not.
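Formula (5) is a one-line computation. A minimal sketch with a hypothetical relevance list; `precision_at_k` is an illustrative name.

```python
def precision_at_k(rel, k):
    """Formula (5): fraction of the top-k returned images that share the
    query image's class (rel[i] is 1 if the rank-(i+1) image is relevant)."""
    return sum(rel[:k]) / k

rel = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]   # relevance of ranks 1..10
p5 = precision_at_k(rel, 5)             # 3 relevant in top 5 -> 0.6
```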
Compared with the prior art, the present invention has the following clear advantages and beneficial effects:
首先,相比传统人工提取遥感影像特征的方法,本发明利用全卷积神经网络构建深度显著性目标检测模型,选择国内外遥感影像数据库训练网络,综合分析图像的三层语义信息,自动学习遥感影像显著性特征。同时,创新性地语义分割加入全卷积神经网络对遥感影像深度显著性的学习,有效完善学习到的显著性特征。实验证实,采用该模型在场景较为复杂的多目标检测数据集上,如微软COCO数据集等均可提取到边缘较清晰的显著性目标。深层神经网络的学习能力可进一步迁移至对遥感影像的显著性特征学习。其次,本发明在全卷积神经网络架构中引入哈希层,在学习遥感影像深度显著性特征的同时生成二进制哈希码,既可节省存储空间,又可提高后续检索效率。最后,在进行图像检索时采用由粗到细的检索策略,综合利用二进制哈希码和显著性特征进行相似性度量。实验证实,在AlexNet神经网络中加入哈希层,并采用由粗到细的多层次检索策略,在250万张不同类别的普通图像检索中,统计返回排名前K幅相似图像的准确率,即topK查准率,当K取1000时,topK查准率平均可达88%,检索时间约为1s。因此,将该方法迁移至遥感影像的检索,对于实现遥感影像准确、高效检索切实可行并具有重要应用价值。First of all, compared with the traditional method of manually extracting remote sensing image features, the present invention uses a fully convolutional neural network to build a deep saliency target detection model, selects domestic and foreign remote sensing image databases for training networks, comprehensively analyzes the three-layer semantic information of images, and automatically learns remote sensing Distinctive features of the image. At the same time, the innovative semantic segmentation is added to the fully convolutional neural network to learn the depth saliency of remote sensing images, effectively improving the learned saliency features. Experiments have confirmed that this model can extract salient objects with clearer edges on multi-object detection data sets with more complex scenes, such as the Microsoft COCO data set. The learning ability of deep neural network can be further transferred to the salient feature learning of remote sensing images. Secondly, the present invention introduces a hash layer into the fully convolutional neural network architecture to generate binary hash codes while learning the depth salient features of remote sensing images, which can save storage space and improve subsequent retrieval efficiency. Finally, a coarse-to-fine retrieval strategy is adopted in image retrieval, and binary hash codes and salient features are used to measure the similarity. 
Experiments confirm that adding a hash layer to the AlexNet network and adopting the coarse-to-fine multi-level retrieval strategy, on a retrieval set of 2.5 million ordinary images of different categories, yields a top-K precision (the precision over the top K returned similar images) averaging 88% for K = 1000, with a retrieval time of about 1 s. Transferring this method to remote sensing image retrieval is therefore feasible and of significant application value for accurate and efficient retrieval of remote sensing images.
Description of drawings
Fig. 1 is a flowchart of the fast remote sensing image retrieval method based on deep saliency;
Fig. 2 is an architecture diagram of the object detection model based on deep saliency;
Fig. 3 is an architecture diagram of the neural network with the added hash layer;
Fig. 4 is a diagram of the multi-level retrieval process.
Detailed description
In accordance with the above description, a specific implementation process follows; the protection scope of this patent is not limited to this implementation process.
Step 1: Construction of the object detection model based on deep saliency
A salient region is, subjectively, the region on which human visual attention concentrates, and is closely tied to the Human Visual System (HVS); objectively, it is the sub-region of an image in which some feature is most pronounced. The key to the saliency detection problem therefore lies in feature learning and extraction. Given the strength of deep learning in this respect, the present invention applies a fully convolutional neural network to the saliency detection problem and proposes a multi-task salient object detection model based on it. The model performs two tasks simultaneously: a saliency detection task and a semantic segmentation task. The saliency detection task learns deep features of remote sensing images and computes deep saliency; the semantic segmentation task extracts semantic information about objects inside the image, eliminating background clutter in the saliency map and filling in missing parts of salient objects.
The fully convolutional network architecture proposed by the present invention is implemented on the mainstream open-source deep learning framework Caffe; the model structure is shown in Fig. 2. An input RGB image passes through a series of convolutions in 15 convolutional layers (Conv); the saliency detection task and the superpixel-level semantic segmentation task share these convolutional layers. The first 13 convolutional layers are initialized from the convolutional neural network VGGNet with 3×3 kernels, and each is followed by a Rectified Linear Unit (ReLU) activation to speed up convergence. Max pooling (MaxPooling) is applied after the 2nd, 4th, 5th, and 13th convolutional layers to reduce the feature dimension, cutting computation while preserving feature invariance. The 14th and 15th convolutional layers use 7×7 and 1×1 kernels, respectively, and each is followed by a Dropout layer to counter the overfitting to which complex network structures are prone, i.e., the model memorizing noise and detail in the training data and consequently showing a high error rate and poor generalization at test time. A deconvolution layer is built for upsampling; its parameters are initialized by bilinear interpolation and updated iteratively as the upsampling function is learned during training. In the salient object detection task, the output map is normalized to [0,1] by a sigmoid threshold function to learn saliency features. In the semantic segmentation task, the deconvolution layer upsamples the feature map of the last convolutional layer, and the result is cropped (Crop) so that the output image is the same size as the input, yielding a prediction for every pixel while preserving the spatial information of the original input image.
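The bilinear-interpolation initialization of the deconvolution layer described above can be sketched as follows. This is a common FCN recipe shown purely as an illustration, not the patent's actual code; the upsampling factor of 2 is an assumed example:

```python
import numpy as np

def bilinear_kernel(factor):
    """Build a 2-D bilinear interpolation kernel for initializing a
    deconvolution (transposed convolution) layer that upsamples by
    the given factor."""
    size = 2 * factor - factor % 2          # kernel size, e.g. 4 for factor 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    # Outer product of two 1-D triangular (bilinear) filters
    kernel = (1 - abs(og[0] - center) / factor) * \
             (1 - abs(og[1] - center) / factor)
    return kernel

k = bilinear_kernel(2)   # 4x4 kernel for 2x upsampling
```

For a factor-f deconvolution layer, copies of this kernel would be placed on the layer's weight tensor, one per channel, before training refines them.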
Step 2: Pre-training the neural network and fine-tuning with an added hash layer
The present invention pre-trains the neural network on the public large-scale Aerial Image Dataset (AID) so that it better learns semantic features of remote sensing images at different levels. A hash layer is then introduced and the network is further fine-tuned on the expanded Wuhan University remote sensing dataset (WHU-RS); this not only maps the high-dimensional features learned by the network to a low-dimensional space and shortens retrieval time, but also makes the learned features more robust.
Step 2.1: Pre-training the multi-task salient object detection model
Step 2.1.1: Building the pre-training dataset
In the pre-training stage, the public large-scale Aerial Image Dataset (AID) is chosen as the standard pre-training dataset. AID contains 10,000 aerial images in 30 categories, all selected from Google Earth and annotated by remote sensing professionals. The images of each category come from different countries and regions and were captured at different times by different remote sensors; each image is 600×600 pixels, with resolutions ranging from 0.5 m/pixel to 8 m/pixel. Compared with other datasets, AID has a smaller intra-class gap and a larger inter-class gap, and is currently the largest aerial image dataset.
Step 2.1.2: Pre-training the salient object detection model
FCNN pre-training proceeds through the saliency detection task and the segmentation task together. Let χ denote a set of N1 training images of width W and height Q, with Xi the i-th image and Y_ijk the pixel-level ground-truth segmentation label of pixel (j,k) in the i-th image, where i = 1…N1, j = 1…W, k = 1…Q. Let Z denote a set of N2 training images, with Zn the n-th image, n = 1…N2, each having a corresponding ground-truth binary map Mn marking the salient object. Let θs be the shared convolutional layer parameters, θh the segmentation task parameters, and θf the saliency task parameters. Formulas (1) and (2) give the cross-entropy cost function J1(χ; θs, θh) of the segmentation task and the squared Euclidean distance cost function J2(Z; θs, θf) of the saliency detection task; the FCNN is trained by minimizing these two cost functions:

J1(χ; θs, θh) = − Σ_{i=1..N1} Σ_{j=1..W} Σ_{k=1..Q} Σ_{c=1..C} 1{Y_ijk = c} · ln h_cjk(Xi; θs, θh)    (1)

J2(Z; θs, θf) = Σ_{n=1..N2} ‖f(Zn; θs, θf) − Mn‖F²    (2)

In formula (1), 1{·} is the indicator function, h_cjk is element (j,k) of the confidence segmentation map for class c, c = 1…C, and h(Xi; θs, θh) is the semantic segmentation function, which returns confidence segmentation maps for all C object classes; C is the number of image categories in the pre-training dataset, taken as 30 in the present invention. In formula (2), f(Zn; θs, θf) is the saliency map output function and ‖·‖F denotes the F-norm.
Next, the above cost functions are minimized with stochastic gradient descent (SGD), with regularization over all training samples. Because the pre-training dataset does not carry segmentation and saliency annotations simultaneously, the segmentation task and the saliency detection task are trained alternately. Since training requires all original images to be normalized to one size, the present invention resizes the original images to 500×500 pixels for pre-training. The learning rate is an essential parameter of SGD that determines how fast the weights are updated: set too large, it makes the cost function oscillate past the optimum; set too small, it makes convergence excessively slow, so a smaller learning rate, such as 0.001±0.01, is generally preferred to keep the system stable. The momentum parameter and weight decay factor improve training adaptability; the momentum parameter usually lies in [0.9, 1.0] and the weight decay factor is usually 0.0005±0.0002. Based on experimental observation, the present invention sets the learning rate to 10⁻¹⁰, the momentum parameter to 0.99, and the weight decay factor to the Caffe framework default of 0.0005. The SGD learning process is accelerated on an NVIDIA GTX 1080 GPU for a total of 80,000 iterations. The detailed pre-training process is as follows:
1) The shared fully convolutional parameters θs are initialized from VGGNet;
2) The segmentation task parameters θh and the saliency task parameters θf are randomly initialized from a normal distribution;
3) With the current θs and θh, the segmentation network is trained with SGD and both parameters are updated;
4) With the current θs and θf, the saliency network is trained with SGD and the relevant parameters are updated;
5) With the current θs and θh, the segmentation network is trained again with SGD, obtaining updated θs and θh;
6) With the current θs and θf, the saliency network is trained again with SGD, updating the relevant parameters;
7) Steps 3)–6) are repeated three times to obtain the final pre-trained parameters θs, θh, θf.
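The alternating schedule of steps 3)–6) can be illustrated with a toy numpy sketch. The one-parameter "tasks", synthetic targets, and learning rate below are invented stand-ins for the real segmentation and saliency networks: a shared parameter theta_s is updated by SGD on each task's loss in turn, while each task keeps its own head parameter.

```python
import numpy as np

# Toy setup: each task predicts theta_task * (theta_s * x), so both task
# losses depend on the shared parameter theta_s (illustration only).
x = np.linspace(-1.0, 1.0, 50)
y_seg = 2.0 * x          # synthetic "segmentation" target
y_sal = 2.0 * x          # synthetic "saliency" target
lr = 0.05

theta_s, theta_h, theta_f = 0.5, 1.0, 1.0

def sgd_phase(theta_s, theta_task, target, steps=200):
    """One training phase: jointly update the shared parameter and the
    current task's head parameter by gradient descent on that task's
    squared-error loss."""
    for _ in range(steps):
        err = theta_task * theta_s * x - target
        g_s = np.mean(err * theta_task * x)   # d(loss)/d(theta_s)
        g_t = np.mean(err * theta_s * x)      # d(loss)/d(theta_task)
        theta_s -= lr * g_s
        theta_task -= lr * g_t
    return theta_s, theta_task

for _ in range(3):                            # "repeat steps 3)-6) three times"
    theta_s, theta_h = sgd_phase(theta_s, theta_h, y_seg)   # segmentation phase
    theta_s, theta_f = sgd_phase(theta_s, theta_f, y_sal)   # saliency phase
```

After the alternation both task predictions fit their targets, even though each phase perturbs the shared parameter the other task relies on; this is the behavior the alternating schedule exploits.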
Step 2.2: Adding the hash layer and fine-tuning the network for the target domain
Step 2.2.1: Constructing a Chinese remote sensing image dataset for fine-tuning
The expanded Wuhan University remote sensing dataset (WHU-RS) is used for fine-tuning the neural network. The original WHU-RS dataset contains 19 scene categories and 950 remote sensing images of varying resolution, each 600×600 pixels, all taken from Google Earth. Taking the topography and landforms of China into account, the original dataset is restructured and expanded to 7,000 remote sensing images as a sample library, with each category containing more than 200 images. The added samples differ in illumination, viewing angle, resolution, and size, which helps the neural network learn more robust saliency features.
Step 2.2.2: Fine-tuning the network with the added hash layer
The feature vectors produced by a deep neural network are high-dimensional, which is very time-consuming in large-scale image retrieval. Since similar images have similar binary hash codes, the present invention inserts a fully connected layer of s neurons, the hash layer H, between the penultimate layer of the pre-trained network and the final task layer; it maps the high-dimensional features to a low-dimensional space and generates binary hash codes for storage. The network structure is shown in Fig. 3. The weights of the hash layer H are initialized with hash values constructed by random projection; the neuron activation function is a sigmoid, which keeps output values between 0 and 1, with the threshold set empirically to 0.5, and the number of neurons equals the code length of the target binary code. The hash layer not only provides an abstraction of the features of the preceding layer but also serves as a bridge between mid-level and high-level image semantic features.
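A minimal numpy sketch of such a hash layer follows: a fully connected layer of s neurons with random-projection weight initialization and a sigmoid squashing outputs into (0, 1). The input dimension of 4096 is an assumption for illustration; the patent does not state the penultimate layer's width.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class HashLayer:
    """Sketch of the hash layer H: s neurons, weights drawn as a random
    projection, sigmoid activation so each output lies in (0, 1)."""
    def __init__(self, in_dim, s, seed=0):
        rng = np.random.default_rng(seed)
        # Random-projection initialization of the s x in_dim weight matrix
        self.W = rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(s, in_dim))
        self.b = np.zeros(s)

    def forward(self, feature):
        return sigmoid(self.W @ feature + self.b)

layer = HashLayer(in_dim=4096, s=48)   # s = 48 as chosen in the patent
out = layer.forward(np.random.default_rng(1).normal(size=4096))
```

During fine-tuning these weights would be updated by back-propagation along with the rest of the network; only the initialization is random.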
The fine-tuning process adjusts the network weights by the back-propagation algorithm. Fine-tuning can be applied to the whole network or to part of it. Since the features learned by the lower layers of the network are more general, and to avoid overfitting, the present invention uses the expanded WHU-RS dataset to adjust mainly the higher layers, i.e., the network weights after the tenth convolutional layer. Typically, the dataset used for fine-tuning is 10%–50% smaller than the pre-training dataset; here, the fine-tuning dataset of 7,000 images is clearly smaller than the 10,000-image pre-training dataset. Compared with the pre-training settings, the network parameters for fine-tuning should be reduced appropriately: the number of iterations and the learning rate can be lowered to 1%–10% of their pre-training values. In the present invention, fine-tuning reduces the number of iterations to 8,000 and the learning rate to 1% of its pre-training value, i.e., 10⁻¹², while the momentum parameter and weight decay factor remain unchanged at 0.99 and 0.0005, respectively.
The detailed fine-tuning process is as follows:
1) The shared fully convolutional parameters θs, the segmentation task parameters θh, and the saliency task parameters θf are taken from the pre-training process;
2) With the current θs and θh, the segmentation network is trained with SGD and both parameters are updated;
3) With the current θs and θf, the saliency network is trained with SGD and the relevant parameters are updated;
4) With the current θs and θh, the segmentation network is trained again with SGD, obtaining updated θs and θh;
5) With the current θs and θf, the saliency network is trained again with SGD, updating the relevant parameters;
6) Steps 2)–5) are repeated three times to obtain the final parameters θs, θh, θf.
Step 3: Multi-level deep retrieval
The shallow layers of a deep convolutional neural network learn low-level visual features, while the deep layers capture image semantics. The present invention therefore adopts a coarse-to-fine retrieval strategy to achieve fast, accurate image retrieval. The feature extraction and retrieval process is shown in Fig. 4.
Step 3.1: Coarse retrieval
First, a set of candidates with similar high-level semantic features, i.e., similar binary activation values in the hash layer, is retrieved; a ranking of similar images is then generated according to a similarity measure.
Step 3.1.1: Generating binary hash codes
An image to be queried, Iq, is input into the fine-tuned neural network, and the output of the hash layer is extracted as the image signature, denoted Out(H). The binary code is obtained by binarizing the activation values against the threshold. For each bit r = 1…s, the binary code is output according to formula (3):

Hr = 1 if Out_r(H) ≥ 0.5, and Hr = 0 otherwise    (3)

Here s is the number of neurons in the hash layer; too many neurons lead to overfitting, so an initial value in the range [40, 100] is recommended, with the exact value adjusted to the actual training data; in the present invention s is set to 48. Γ = {I1, I2, …, In} denotes a retrieval dataset of n images. The binary code of each image is correspondingly written ΓH = {H1, H2, …, Hn}, where for i = 1…n, Hi ∈ {0,1}^s is the s-bit binary code whose bits, generated by the s neurons, each take the value 0 or 1.
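Formula (3) amounts to thresholding the hash-layer activations at 0.5, which can be sketched directly; the activation values below are hypothetical, for an s = 8 bit code:

```python
import numpy as np

def binarize(out_h, threshold=0.5):
    """Apply formula (3): bit r is 1 if the r-th hash-layer activation
    Out_r(H) is at least the threshold (0.5), else 0."""
    return (np.asarray(out_h) >= threshold).astype(np.uint8)

# Hypothetical hash-layer activations (illustration only)
activations = [0.91, 0.12, 0.55, 0.49, 0.73, 0.05, 0.50, 0.38]
code = binarize(activations).tolist()   # -> [1, 0, 1, 0, 1, 0, 1, 0]
```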
Step 3.1.2: Measuring similarity by Hamming distance
The Hamming distance between two strings of equal length is the number of positions at which the corresponding characters differ. For a query image Iq with binary code Hq, if the Hamming distance between Hq and some Hi ∈ ΓH is below a set threshold, a candidate pool P = {Ic1, Ic2, …, Icm} of m candidate images is formed; in general, two images can be considered similar when their Hamming distance is less than 5.
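The coarse stage can be sketched as follows: count differing bits between the query code and each database code, and keep those within the threshold as the candidate pool. The 8-bit codes and threshold value here mirror the text's "less than 5" rule but are otherwise illustrative assumptions.

```python
import numpy as np

def hamming(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

def coarse_candidates(h_q, codes, max_dist=5):
    """Return indices of database images whose binary code lies within
    max_dist Hamming distance of the query code (the candidate pool P)."""
    return [i for i, h in enumerate(codes) if hamming(h_q, h) < max_dist]

h_q = np.array([1, 0, 1, 1, 0, 0, 1, 0])
database = [
    np.array([1, 0, 1, 1, 0, 0, 1, 0]),   # distance 0 -> candidate
    np.array([1, 1, 1, 1, 0, 0, 1, 0]),   # distance 1 -> candidate
    np.array([0, 1, 0, 0, 1, 1, 0, 1]),   # distance 8 -> rejected
]
pool = coarse_candidates(h_q, database)    # -> [0, 1]
```

In practice the codes would be packed into machine words and compared with XOR plus a popcount, but the filtering logic is the same.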
Step 3.2: Fine retrieval
Step 3.2.1: Saliency feature extraction
Different convolutional layers of a deep convolutional network learn semantic features of an image at different levels, and the features learned by the middle and higher convolutional layers are better suited to image retrieval tasks. Therefore, the two-dimensional remote sensing feature maps of the query image Iq produced by the 13th and 15th convolutional layers of the network are each mapped into a one-dimensional vector and stored. In the subsequent retrieval process, the results obtained with the different feature vectors are compared to decide which layer's feature maps are finally used to extract the saliency features of remote sensing images.
Step 3.2.2: Measuring similarity by Euclidean distance
For a query image Iq and a candidate pool P, the extracted saliency feature vectors are used to pick the top-k images from P. Let Vq and Vci denote the feature vectors of the query image Iq and of candidate Ici, respectively. The Euclidean distance si between the feature vectors of Iq and the i-th image in P is defined as their similarity level, as in formula (4):

si = ‖Vq − Vci‖2    (4)

The smaller the Euclidean distance, the greater the similarity between the two images. The candidates Ici are sorted in ascending order of distance to the query image, and the top-k images are returned as the retrieval result.
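The fine-ranking step can be sketched with numpy; the 3-dimensional feature vectors below are placeholders for the flattened layer-13/15 feature maps:

```python
import numpy as np

def fine_rank(v_q, candidate_vecs, k=2):
    """Rank candidate-pool images by the Euclidean distance of formula (4)
    between their saliency feature vectors and the query's, ascending,
    and return the indices of the top-k most similar images."""
    d = [float(np.linalg.norm(v_q - v)) for v in candidate_vecs]
    order = np.argsort(d)                 # smaller distance = more similar
    return order[:k].tolist()

v_q = np.array([1.0, 0.0, 2.0])
candidates = [
    np.array([1.0, 0.1, 2.0]),    # distance 0.1
    np.array([4.0, 4.0, 4.0]),    # distance ~5.4
    np.array([1.0, 0.0, 1.0]),    # distance 1.0
]
top = fine_rank(v_q, candidates, k=2)     # -> [0, 2]
```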
Step 3.3: Evaluation of retrieval results
The present invention evaluates the retrieval results with a ranking-based criterion. For a query image q and the top-k retrieved result images, the precision is computed according to the following formula:

Precision@k = ( Σ_{i=1..k} Rel(i) ) / k

where the threshold k is set according to actual needs and Precision@k is the average precision over the top k returned results; Rel(i) ∈ {0, 1} denotes the relevance of the query image q and the image ranked i-th: 1 means the query image q and the i-th ranked image share the same class, i.e., they are relevant, and 0 means they are not.
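The evaluation formula reduces to a few lines of Python; the relevance judgments below are hypothetical:

```python
def precision_at_k(rel, k):
    """Precision@k: the fraction of the top-k returned images that share
    the query's class, with rel[i] in {0, 1} as defined in the text."""
    assert all(r in (0, 1) for r in rel[:k])
    return sum(rel[:k]) / k

# Hypothetical relevance judgments for the top 5 returned images
rel = [1, 1, 0, 1, 0]
p = precision_at_k(rel, 5)    # -> 0.6
```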
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710087670.5A CN106909924B (en) | 2017-02-18 | 2017-02-18 | Remote sensing image rapid retrieval method based on depth significance |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106909924A true CN106909924A (en) | 2017-06-30 |
CN106909924B CN106909924B (en) | 2020-08-28 |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140122400A1 (en) * | 2012-10-25 | 2014-05-01 | Brain Corporation | Apparatus and methods for activity-based plasticity in a spiking neuron network |
CN105243154A (en) * | 2015-10-27 | 2016-01-13 | 武汉大学 | Remote sensing image retrieval method and system based on salient point features and sparse autoencoding |
CN105550709A (en) * | 2015-12-14 | 2016-05-04 | 武汉大学 | Remote sensing image power transmission line corridor forest region extraction method |
US20160232430A1 (en) * | 2014-05-29 | 2016-08-11 | International Business Machines Corporation | Scene understanding using a neurosynaptic system |
CN106227851A (en) * | 2016-07-29 | 2016-12-14 | 汤平 | End-to-end image retrieval method via deep hierarchical search based on deep convolutional neural networks |
CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | Vehicle model recognition method based on Fast R-CNN deep neural network |
CN106295139A (en) * | 2016-07-29 | 2017-01-04 | 汤平 | Tongue self-diagnosis health cloud service system based on deep convolutional neural networks |
CN106296692A (en) * | 2016-08-11 | 2017-01-04 | 深圳市未来媒体技术研究院 | Image saliency detection method based on adversarial networks |
CN106354735A (en) * | 2015-07-22 | 2017-01-25 | 杭州海康威视数字技术股份有限公司 | Image target searching method and device |
CN106408001A (en) * | 2016-08-26 | 2017-02-15 | 西安电子科技大学 | Rapid region-of-interest detection method based on deep kernelized hashing |
- 2017-02-18: application CN201710087670.5A filed in China; granted as CN106909924B, status Active
Non-Patent Citations (5)
Title |
---|
XIA Rongkai et al.: "Supervised Hashing for Image Retrieval via Image Representation Learning", Proceedings of the AAAI Conference on Artificial Intelligence * |
LI Yin et al.: "The secrets of salient object segmentation", 2014 IEEE Conference on Computer Vision and Pattern Recognition * |
LIU Ye et al.: "FP-CNNH: A fast image hashing algorithm based on deep convolutional neural networks", Computer Science * |
KE Shengcai et al.: "Image retrieval method based on convolutional neural networks and supervised kernel hashing", Acta Electronica Sinica * |
GONG Zhenting et al.: "Image retrieval method based on convolutional neural networks and hash coding", CAAI Transactions on Intelligent Systems * |
Cited By (101)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291945A (en) * | 2017-07-12 | 2017-10-24 | 上海交通大学 | High-precision clothing image retrieval method and system based on a visual attention model |
CN107463932A (en) * | 2017-07-13 | 2017-12-12 | 央视国际网络无锡有限公司 | Method for extracting picture features by using a binary bottleneck neural network |
CN107463932B (en) * | 2017-07-13 | 2020-07-10 | 央视国际网络无锡有限公司 | Method for extracting picture features by using binary bottleneck neural network |
CN110945535B (en) * | 2017-07-26 | 2024-01-26 | 国际商业机器公司 | Method for realizing artificial neural network ANN |
CN110945535A (en) * | 2017-07-26 | 2020-03-31 | 国际商业机器公司 | System and method for constructing synaptic weights for artificial neural networks from signed simulated conductance pairs with varying significance |
CN107392925A (en) * | 2017-08-01 | 2017-11-24 | 西安电子科技大学 | Remote sensing image terrain classification method based on super-pixel coding and convolutional neural networks |
CN107392925B (en) * | 2017-08-01 | 2020-07-07 | 西安电子科技大学 | Remote sensing image ground object classification method based on super-pixel coding and convolutional neural network |
CN107480261A (en) * | 2017-08-16 | 2017-12-15 | 上海荷福人工智能科技(集团)有限公司 | Fine-grained face image fast retrieval method based on deep learning |
CN107480261B (en) * | 2017-08-16 | 2020-06-16 | 上海荷福人工智能科技(集团)有限公司 | Fine-grained face image fast retrieval method based on deep learning |
CN109410211A (en) * | 2017-08-18 | 2019-03-01 | 北京猎户星空科技有限公司 | Method and device for segmenting a target object in an image |
CN109657522A (en) * | 2017-10-10 | 2019-04-19 | 北京京东尚科信息技术有限公司 | Method and apparatus for detecting drivable regions |
CN107729992B (en) * | 2017-10-27 | 2020-12-29 | 深圳市未来媒体技术研究院 | Deep learning method based on back propagation |
CN107729992A (en) * | 2017-10-27 | 2018-02-23 | 深圳市未来媒体技术研究院 | Deep learning method based on back propagation |
EP3477555A1 (en) * | 2017-10-31 | 2019-05-01 | General Electric Company | Multi-task feature selection neural networks |
CN109726812A (en) * | 2017-10-31 | 2019-05-07 | 通用电气公司 | Feature Ranking Neural Networks and Methods, Methods for Generating Simplified Feature Set Models |
CN108090117A (en) * | 2017-11-06 | 2018-05-29 | 北京三快在线科技有限公司 | A kind of image search method and device, electronic equipment |
US11281714B2 (en) | 2017-11-06 | 2022-03-22 | Beijing Sankuai Online Technology Co., Ltd | Image retrieval |
WO2019136591A1 (en) * | 2018-01-09 | 2019-07-18 | 深圳大学 | Salient object detection method and system for weak supervision-based spatio-temporal cascade neural network |
CN108446312B (en) * | 2018-02-06 | 2020-04-21 | 西安电子科技大学 | Optical remote sensing image retrieval method based on deep convolutional semantic network |
CN108446312A (en) * | 2018-02-06 | 2018-08-24 | 西安电子科技大学 | Remote sensing image retrieval method based on deep convolutional semantic network |
CN108257139A (en) * | 2018-02-26 | 2018-07-06 | 中国科学院大学 | RGB-D three-dimensional object detection method based on deep learning |
CN108257139B (en) * | 2018-02-26 | 2020-09-08 | 中国科学院大学 | RGB-D three-dimensional object detection method based on deep learning |
CN108427738A (en) * | 2018-03-01 | 2018-08-21 | 中山大学 | A kind of fast image retrieval method based on deep learning |
CN108287926A (en) * | 2018-03-02 | 2018-07-17 | 宿州学院 | Multi-source heterogeneous big data acquisition, processing and analysis framework for agro-ecology |
US11618438B2 (en) * | 2018-03-26 | 2023-04-04 | International Business Machines Corporation | Three-dimensional object localization for obstacle avoidance using one-shot convolutional neural network |
CN110414301A (en) * | 2018-04-28 | 2019-11-05 | 中山大学 | A method for estimating crowd density in train cars based on dual cameras |
CN108647655A (en) * | 2018-05-16 | 2018-10-12 | 北京工业大学 | Low-altitude aerial image power line foreign object detection method based on light convolutional neural network |
CN108647655B (en) * | 2018-05-16 | 2022-07-12 | 北京工业大学 | Low-altitude aerial image power line foreign object detection method based on light convolutional neural network |
CN109033505A (en) * | 2018-06-06 | 2018-12-18 | 东北大学 | Ultra-fast cooling temperature control method based on deep learning |
WO2019237646A1 (en) * | 2018-06-14 | 2019-12-19 | 清华大学深圳研究生院 | Image retrieval method based on deep learning and semantic segmentation |
CN109063569B (en) * | 2018-07-04 | 2021-08-24 | 北京航空航天大学 | A Semantic-level Change Detection Method Based on Remote Sensing Images |
CN109063569A (en) * | 2018-07-04 | 2018-12-21 | 北京航空航天大学 | Semantic-level change detection method based on remote sensing images |
CN109191426A (en) * | 2018-07-24 | 2019-01-11 | 江南大学 | Planar image saliency detection method |
CN109101907B (en) * | 2018-07-28 | 2020-10-30 | 华中科技大学 | A Vehicle Image Semantic Segmentation System Based on Bilateral Segmentation Network |
CN109101907A (en) * | 2018-07-28 | 2018-12-28 | 华中科技大学 | Vehicle-mounted image semantic segmentation system based on bilateral segmentation network |
US11010629B2 (en) | 2018-08-24 | 2021-05-18 | Petrochina Company Limited | Method for automatically extracting image features of electrical imaging well logging, computer equipment and non-transitory computer readable medium |
CN109389128A (en) * | 2018-08-24 | 2019-02-26 | 中国石油天然气股份有限公司 | Automatic extraction method and device for electric imaging logging image characteristics |
CN110866425A (en) * | 2018-08-28 | 2020-03-06 | 天津理工大学 | Pedestrian identification method based on light field camera and depth migration learning |
CN109035315A (en) * | 2018-08-28 | 2018-12-18 | 武汉大学 | Remote sensing image registration method and system fusing SIFT features and CNN features |
CN109389051A (en) * | 2018-09-20 | 2019-02-26 | 华南农业大学 | Building remote sensing image recognition method based on convolutional neural networks |
CN109284741A (en) * | 2018-10-30 | 2019-01-29 | 武汉大学 | A large-scale remote sensing image retrieval method and system based on deep hash network |
CN109522821A (en) * | 2018-10-30 | 2019-03-26 | 武汉大学 | Large-scale cross-source remote sensing image retrieval method based on cross-modal deep hashing network |
WO2020098296A1 (en) * | 2018-11-15 | 2020-05-22 | 中国银联股份有限公司 | Image retrieval method and device |
CN109522435A (en) * | 2018-11-15 | 2019-03-26 | 中国银联股份有限公司 | A kind of image search method and device |
CN109522435B (en) * | 2018-11-15 | 2022-05-20 | 中国银联股份有限公司 | Image retrieval method and device |
CN109639964A (en) * | 2018-11-26 | 2019-04-16 | 北京达佳互联信息技术有限公司 | Image processing method, processing unit and computer readable storage medium |
CN111260021A (en) * | 2018-11-30 | 2020-06-09 | 百度(美国)有限责任公司 | Predictive deep learning scaling |
CN111260021B (en) * | 2018-11-30 | 2024-04-05 | 百度(美国)有限责任公司 | Prediction deep learning scaling |
CN109753576A (en) * | 2018-12-25 | 2019-05-14 | 上海七印信息科技有限公司 | A kind of method for retrieving similar images |
CN111368109B (en) * | 2018-12-26 | 2023-04-28 | 北京眼神智能科技有限公司 | Remote sensing image retrieval method, remote sensing image retrieval device, computer readable storage medium and computer readable storage device |
CN111368109A (en) * | 2018-12-26 | 2020-07-03 | 北京眼神智能科技有限公司 | Remote sensing image retrieval method and device, computer readable storage medium and equipment |
CN109766467A (en) * | 2018-12-28 | 2019-05-17 | 珠海大横琴科技发展有限公司 | Remote sensing image retrieval method and system based on image segmentation and improved VLAD |
CN109766938A (en) * | 2018-12-28 | 2019-05-17 | 武汉大学 | Multi-class object detection method in remote sensing image based on scene label constrained deep network |
CN109670057B (en) * | 2019-01-03 | 2021-06-29 | 电子科技大学 | A progressive end-to-end deep feature quantization system and method |
CN109670057A (en) * | 2019-01-03 | 2019-04-23 | 电子科技大学 | Progressive end-to-end deep feature quantization system and method |
CN109902192A (en) * | 2019-01-15 | 2019-06-18 | 华南师范大学 | Remote sensing image retrieval method, system, equipment and medium based on unsupervised depth regression |
CN109919059B (en) * | 2019-02-26 | 2021-01-26 | 四川大学 | Salient object detection method based on deep network layering and multi-task training |
CN109886221A (en) * | 2019-02-26 | 2019-06-14 | 浙江水利水电学院 | Sand dredger recognition method based on saliency detection |
CN109919059A (en) * | 2019-02-26 | 2019-06-21 | 四川大学 | A salient object detection method based on deep network hierarchy and multi-task training |
CN109919108A (en) * | 2019-03-11 | 2019-06-21 | 西安电子科技大学 | A fast target detection method for remote sensing images based on deep hash-aided network |
CN109919108B (en) * | 2019-03-11 | 2022-12-06 | 西安电子科技大学 | Fast Object Detection Method for Remote Sensing Image Based on Deep Hash Assisted Network |
CN110020658A (en) * | 2019-03-28 | 2019-07-16 | 大连理工大学 | Salient object detection method based on multi-task deep learning |
CN110263799A (en) * | 2019-06-26 | 2019-09-20 | 山东浪潮人工智能研究院有限公司 | Image classification method and device based on deep saliency similarity graph learning |
CN110334765B (en) * | 2019-07-05 | 2023-03-24 | 西安电子科技大学 | Remote sensing image classification method based on attention mechanism multi-scale deep learning |
CN110334765A (en) * | 2019-07-05 | 2019-10-15 | 西安电子科技大学 | Remote sensing image classification method based on multi-scale deep learning of attention mechanism |
CN110399847A (en) * | 2019-07-30 | 2019-11-01 | 北京字节跳动网络技术有限公司 | Key frame extraction method and device, and electronic equipment |
CN110399847B (en) * | 2019-07-30 | 2021-11-09 | 北京字节跳动网络技术有限公司 | Key frame extraction method and device and electronic equipment |
CN110414513A (en) * | 2019-07-31 | 2019-11-05 | 电子科技大学 | Visual Saliency Detection Method Based on Semantic Enhanced Convolutional Neural Network |
CN110633633A (en) * | 2019-08-08 | 2019-12-31 | 北京工业大学 | A Road Extraction Method Based on Adaptive Threshold from Remote Sensing Image |
CN110580503A (en) * | 2019-08-22 | 2019-12-17 | 江苏和正特种装备有限公司 | AI-based double-spectrum target automatic identification method |
CN110765886A (en) * | 2019-09-29 | 2020-02-07 | 深圳大学 | Road target detection method and device based on convolutional neural network |
CN110765886B (en) * | 2019-09-29 | 2022-05-03 | 深圳大学 | A method and device for road target detection based on convolutional neural network |
CN110852295A (en) * | 2019-10-15 | 2020-02-28 | 深圳龙岗智能视听研究院 | Video behavior identification method based on multitask supervised learning |
CN110852295B (en) * | 2019-10-15 | 2023-08-25 | 深圳龙岗智能视听研究院 | Video behavior recognition method based on multitasking supervised learning |
CN112712090A (en) * | 2019-10-24 | 2021-04-27 | 北京易真学思教育科技有限公司 | Image processing method, device, equipment and storage medium |
CN110853053A (en) * | 2019-10-25 | 2020-02-28 | 天津大学 | Salient object detection method taking multiple candidate objects as semantic knowledge |
CN111160127A (en) * | 2019-12-11 | 2020-05-15 | 中国资源卫星应用中心 | A remote sensing image processing and detection method based on a deep convolutional neural network model |
CN111695572A (en) * | 2019-12-27 | 2020-09-22 | 珠海大横琴科技发展有限公司 | Ship retrieval method and device based on convolutional layer feature extraction |
CN111640087A (en) * | 2020-04-14 | 2020-09-08 | 中国测绘科学研究院 | Image change detection method based on SAR (synthetic aperture radar) deep full convolution neural network |
CN112052736A (en) * | 2020-08-06 | 2020-12-08 | 浙江理工大学 | Cloud computing platform-based field tea tender shoot detection method |
CN112102245B (en) * | 2020-08-17 | 2024-08-20 | 清华大学 | Deep learning-based grape embryo slice image processing method and device |
CN112102245A (en) * | 2020-08-17 | 2020-12-18 | 清华大学 | Grape fetus slice image processing method and device based on deep learning |
CN112541912B (en) * | 2020-12-23 | 2024-03-12 | 中国矿业大学 | Rapid detection method and device for salient targets in mine sudden disaster scene |
CN112541912A (en) * | 2020-12-23 | 2021-03-23 | 中国矿业大学 | Method and device for rapidly detecting saliency target in mine sudden disaster scene |
CN112579816A (en) * | 2020-12-29 | 2021-03-30 | 二十一世纪空间技术应用股份有限公司 | Remote sensing image retrieval method and device, electronic equipment and storage medium |
CN112667832A (en) * | 2020-12-31 | 2021-04-16 | 哈尔滨工业大学 | Vision-based mutual positioning method in unknown indoor environment |
CN112667832B (en) * | 2020-12-31 | 2022-05-13 | 哈尔滨工业大学 | A Vision-Based Mutual Localization Method in Unknown Indoor Environment |
CN114764822A (en) * | 2021-01-12 | 2022-07-19 | 中国移动通信有限公司研究院 | Image processing method and device and electronic equipment |
CN114764822B (en) * | 2021-01-12 | 2025-05-09 | 中国移动通信有限公司研究院 | Image processing method, device and electronic equipment |
CN112801192B (en) * | 2021-01-26 | 2024-03-19 | 北京工业大学 | Extended LargeVis image feature dimension reduction method based on deep neural network |
CN112801192A (en) * | 2021-01-26 | 2021-05-14 | 北京工业大学 | Extended LargeVis image feature dimension reduction method based on deep neural network |
CN112926667B (en) * | 2021-03-05 | 2022-08-30 | 中南民族大学 | Method and device for detecting saliency target of depth fusion edge and high-level feature |
CN112926667A (en) * | 2021-03-05 | 2021-06-08 | 中南民族大学 | Method and device for detecting saliency target of depth fusion edge and high-level feature |
CN113205481A (en) * | 2021-03-19 | 2021-08-03 | 浙江科技学院 | Salient object detection method based on stepped progressive neural network |
CN113326926A (en) * | 2021-06-30 | 2021-08-31 | 上海理工大学 | Fully-connected Hash neural network for remote sensing image retrieval |
CN113326926B (en) * | 2021-06-30 | 2023-05-09 | 上海理工大学 | A fully connected hashing neural network for remote sensing image retrieval |
CN115292530A (en) * | 2022-09-30 | 2022-11-04 | 北京数慧时空信息技术有限公司 | Remote sensing image overall management system |
CN116089646A (en) * | 2023-01-04 | 2023-05-09 | 武汉理工大学 | Unmanned aerial vehicle image hash retrieval method based on saliency capture mechanism |
CN116089646B (en) * | 2023-01-04 | 2025-07-18 | 武汉理工大学 | A hash retrieval method for drone images based on saliency capture mechanism |
CN116894100A (en) * | 2023-07-24 | 2023-10-17 | 北京和德宇航技术有限公司 | Remote sensing image display control method, device and storage medium |
CN116894100B (en) * | 2023-07-24 | 2024-04-09 | 北京和德宇航技术有限公司 | Remote sensing image display control method, device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106909924B (en) | 2020-08-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106909924B (en) | 2020-08-28 | Remote sensing image rapid retrieval method based on deep saliency | |
CN111931684B (en) | A weak and small target detection method based on discriminative features of video satellite data | |
Jia et al. | A semisupervised Siamese network for hyperspectral image classification | |
CN109948425B (en) | A pedestrian search method and device based on structure-aware self-attention and online instance aggregation and matching | |
Xia et al. | AID: A benchmark data set for performance evaluation of aerial scene classification | |
CN109598241B (en) | Recognition method for ships at sea in satellite imagery based on Faster R-CNN | |
Ali et al. | Image retrieval by addition of spatial information based on histograms of triangular regions | |
Xu et al. | High-resolution remote sensing image change detection combined with pixel-level and object-level | |
Wulamu et al. | Multiscale road extraction in remote sensing images | |
CN114694038A (en) | High-resolution remote sensing image classification method and system based on deep learning | |
Liu et al. | Survey of road extraction methods in remote sensing images based on deep learning | |
CN109993072A (en) | Low-resolution pedestrian re-identification system and method based on super-resolution image generation | |
CN106844739B (en) | A Retrieval Method of Remote Sensing Image Change Information Based on Neural Network Co-training | |
CN106886785A (en) | Aerial image fast matching algorithm based on multi-feature hash learning | |
Kollapudi et al. | A New Method for Scene Classification from the Remote Sensing Images. | |
CN116824485A (en) | A deep learning-based small target detection method for disguised persons in open scenes | |
CN116363526A (en) | MROCNet model construction and multi-source remote sensing image change detection method and system | |
CN114998688A (en) | A large field of view target detection method based on improved YOLOv4 algorithm | |
CN109583371A (en) | Landmark information extraction and matching method based on deep learning | |
Verma et al. | Intelligence embedded image caption generator using LSTM based RNN model | |
CN115497006B (en) | Urban remote sensing image change depth monitoring method and system based on dynamic mixing strategy | |
Codella et al. | Towards large scale land-cover recognition of satellite images | |
CN108960005A (en) | Method and system for establishing and displaying visual object labels in an intelligent vision Internet of Things | |
CN116863327B (en) | Cross-domain small sample classification method based on cooperative antagonism of double-domain classifier | |
CN107832793A (en) | Hyperspectral image classification method and system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |