CN108108657A - Modified locality-sensitive hashing vehicle retrieval method based on multi-task deep learning
- Publication number: CN108108657A
- Application number: CN201711135951.XA
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06V20/584: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads, of vehicle lights or traffic lights
- G06F16/325: Information retrieval; indexing structures; hash tables
- G06F18/253: Pattern recognition; fusion techniques of extracted features
- G06N3/045: Neural networks; combinations of networks
- G06N3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
- G06V10/462: Extraction of image or video features; salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/56: Extraction of image or video features relating to colour
Description
Technical Field
The present invention relates to the application of computer vision, pattern recognition, information retrieval, multi-task learning, similarity measurement, deep auto-encoding convolutional neural networks, and deep learning to the field of image retrieval, and in particular to a modified locality-sensitive hashing vehicle retrieval method based on multi-task deep learning.
Background Art
With rapid socio-economic development, motor vehicles have become an indispensable means of daily transportation, but also a necessary tool for criminals and terrorists engaging in illegal activities. Checkpoint (bayonet) equipment has been deployed on provincial and inter-city expressways, trunk roads, city entrances and exits, and major traffic arteries to collect information on passing vehicles. However, current checkpoints generally rely on license plate recognition, so a suspect vehicle can evade their tracking and identification simply by using fake plates, cloned plates, no plates, or frequently swapped plates.
Image-based vehicle feature recognition involves image processing, pattern recognition, computer vision, and related technical fields. Current research on this technology, domestic and international, falls roughly into three directions: (1) license-plate-based vehicle type recognition, which only reads the plate from the image and does not directly analyze the vehicle type, yielding coarse classification granularity and failing entirely for cloned-plate vehicles; (2) logo-based vehicle type recognition, which in practice cannot achieve the desired results because of objective factors such as the small size of the logo, lighting, and occlusion; (3) appearance-based vehicle type recognition, which is more robust than the previous two methods and supports finer-grained recognition, down to the vehicle's brand, series, model, and model year.
Appearance-based vehicle feature recognition mainly comprises three steps: vehicle segmentation, vehicle feature extraction, and vehicle classification. Traditional vehicle type recognition methods include template matching, statistical pattern recognition, neural network methods, bionic (topological) pattern recognition, and support vector machines. Each of these methods has its own shortcomings and cannot simultaneously satisfy the two most important criteria of vehicle classification: speed and accuracy. The factors that most strongly affect these two criteria are the extracted vehicle features and the speed of vehicle localization, so feature extraction and fast target localization are the keys to the whole recognition process. Vehicle feature extraction is affected by many factors: the many vehicle types without obvious distinguishing features, vehicle motion, large variation in model appearance caused by camera height and angle, weather, and illumination.
The development of deep learning has advanced the ability to structure images and extract features. Early checkpoint systems had weak intelligent analysis capabilities, low picture quality, and low license plate recognition accuracy, so target vehicles often had to be searched for manually in massive volumes of passing-vehicle pictures or videos based on the vehicle's intrinsic attributes such as brand, model, and color. Owing to limited front-line police resources, high labor intensity, the large variety of vehicle types, and uncertain lighting angles, neither the accuracy nor the timeliness of such searches can be guaranteed, and in emergencies the best response window is often missed. By applying a vehicle-feature deep learning system to structurally analyze and recognize the passing-vehicle pictures captured by front-end or simple checkpoints, the valuable information in these massive image archives can be fully mined. This not only improves license plate and vehicle model accuracy but also enriches the recognized vehicle attributes, enabling detection of vehicle sub-brand, body color, unbuckled seat belts, drivers using phones, sun visor status, and so on, and allows fine-grained correction of passing-vehicle data. It moves beyond the traditional single approach of analysis based solely on license plates, provides richer and more practical vehicle prevention and control applications for checkpoint data, enables effective early warning and control of high-risk vehicles, optimizes police deployment for targeted vehicle screening, effectively locks down suspect vehicles in the many cases involving vehicles and driving, improves criminal investigation efficiency, and shifts public security prevention and control from passive post-incident investigation to proactive early warning.
Chinese invention patent application CN201510744990.4 discloses a vehicle retrieval method based on similarity learning. Given a vehicle region, SIFT feature points and descriptors are obtained and a clustering algorithm is used to discretize the SIFT features. To compensate for the lack of positional information in SIFT features, the distribution of discretized SIFT features within a neighborhood is further used to generate neighborhood features as the final feature point description. Each vehicle picture is represented by a set of features; the features of a pair of similar vehicle pictures form a positive sample, and the features of a pair of different vehicle pictures form a negative sample. After collecting a large number of positive and negative samples in this way, the random forest method is used for similarity learning, and the resulting classifier can judge whether two vehicles are similar, achieving similar-vehicle retrieval. However, SIFT features cannot fully capture vehicle characteristics.
Chinese invention patent application CN201610711333.4 discloses a big-data-based vehicle retrieval method and device. The method includes: extracting the vehicle inspection marks of the target vehicle from the target vehicle image; fusing the inspection marks according to their positional relationships to obtain multiple fusion regions, each containing at least one inspection mark; determining the shape and color of each inspection mark in each fusion region; and retrieving the target vehicle layer by layer from multiple candidate images according to the number of inspection marks, the number of fusion regions, and the number, shape, and color of the inspection marks in each fusion region. This technique retrieves vehicles using only a single type of feature.
Chinese invention patent application CN201710451957.1 discloses a machine-vision-based retrieval and recognition system for cloned-plate vehicles. The system mainly comprises a vehicle image acquisition system, a database system, and a retrieval system. It retrieves suspect vehicles with the help of in-vehicle decorations, such as dashboard ornaments and annual inspection stickers, by collecting features from the decoration region images and performing vehicle retrieval with sparse coding of those region images, thereby solving the problem of searching for a target vehicle in massive traffic scene images and accurately identifying and discovering cloned-plate vehicles. However, this technique has high time complexity on large databases.
In summary, image-based search using a deep auto-encoding convolutional neural network and a modified locality-sensitive hashing re-ranking method still faces several thorny problems: 1) how to accurately segment the complete image of the vehicle under test from a complex background, and how to learn the feature data of vehicle models from as little labeled image data as possible; 2) how to classify vehicle models at a finer granularity, recognizing more information such as the vehicle's brand, series, and body color, and how to process vehicle model, license plate, and logo in parallel within the same deep convolutional neural network, i.e., multi-task parallel computation for deep learning, to improve vehicle identification; 3) how to design a method for extracting instance features from vehicle images for retrieval among similar vehicle types; 4) how to use the extracted features to build a hierarchical deep search that yields more precise retrieval results; 5) how to reduce the large storage consumption and slow retrieval speed of image retrieval systems in the big-data era.
Summary of the Invention
To address the problems of existing vehicle retrieval technology, namely low automation and intelligence, lack of deep learning, difficulty obtaining precise retrieval results, large storage consumption, and retrieval speeds too slow to meet the image retrieval demands of the big-data era, the present invention proposes an end-to-end vehicle image retrieval method based on a deep auto-encoding convolutional neural network with hierarchical deep search. Deep learning raises the automation and intelligence of the retrieval system while tightly integrating image recognition, feature extraction, and retrieval efficiency, so that the whole system obtains precise retrieval results; sparse coding reduces the system's dependence on memory and speeds up retrieval, thereby meeting the image retrieval demands of the big-data era.
To solve the above technical problems, the present invention provides the following technical solution:
A modified locality-sensitive hashing vehicle retrieval method based on multi-task deep learning, comprising the following steps:
1) Construct a multi-task end-to-end convolutional neural network for deep learning, training, and recognition; the training data and the layer-by-layer progressive network structure learn the vehicle's various attributes in depth, including vehicle model, series, logo, color, and license plate;
2) Use the multi-task convolutional neural network of step 1) with a segmented parallel learning and encoding strategy to construct the vehicle attribute hash code;
3) Construct a feature pyramid module from a pyramid pooling layer and a vector compression layer so that convolutional feature maps of different sizes can serve as input when extracting the vehicle's instance features;
4) Use the instance features obtained in step 3) to construct a locality-sensitive re-ranking algorithm;
5) Construct a cross-modal retrieval method for the case where no query image of the vehicle is available, thereby realizing vehicle retrieval.
Further, the multi-task end-to-end convolutional neural network for deep learning and recognition contains a shared convolution module, a region-of-interest coordinate regression and recognition module, a multi-task learning module, and an instance feature extraction module.
Shared convolution module: the shared network consists of five convolution modules; the last layers of conv2_x through conv5_x output feature maps of sizes {4², 8², 16², 16²} respectively, while conv1, the input module, contains only a single convolutional layer;
A region-of-interest coordinate regression and recognition module is connected after the shared convolution module. This module takes an image of arbitrary size as input and outputs a set of rectangular prediction boxes for the target regions, including the position coordinates of each prediction box and probability scores over the dataset categories. To generate region proposals, the input image first passes through the shared convolutional layers to produce a feature map, on which a multi-scale convolution operation is then performed: at each sliding-window position, 3 scales and 3 aspect ratios are used, centered on the center of the current sliding window, so that 9 candidate regions of different scales can be mapped back onto the original image. For a shared convolutional feature map of size w×h, there are w×h×9 candidate regions in total. Finally, the classification layer outputs scores for the w×h×9×2 outcomes, i.e., the estimated probability that each region is target or non-target, and the regression layer outputs w×h×9×4 parameters, i.e., the coordinate parameters of the candidate regions;
When training the RPN, each candidate region is assigned a binary label to mark whether the region is an object, as follows: 1) a positive label is assigned to the candidate region having the highest IoU (Intersection-over-Union) overlap with a ground-truth (GT) region, and 2) to any candidate region whose IoU overlap with any GT bounding box exceeds 0.7; a negative label is assigned to candidate regions whose IoU with all GT bounding boxes is below 0.3; 3) regions in between are discarded.
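The anchor generation and labeling rule above can be sketched as follows; this is a minimal NumPy illustration under assumed example scales, ratios, and feature stride, not the patent's implementation:

```python
import numpy as np

def generate_anchors(w, h, stride=16,
                     scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Map 9 anchors (3 scales x 3 ratios) to every position of a w*h feature map."""
    anchors = []
    for y in range(h):
        for x in range(w):
            cx, cy = x * stride, y * stride          # window center on the original image
            for s in scales:
                for r in ratios:
                    aw = s * stride * np.sqrt(r)
                    ah = s * stride / np.sqrt(r)
                    anchors.append([cx - aw / 2, cy - ah / 2, cx + aw / 2, cy + ah / 2])
    return np.array(anchors)                          # shape (w*h*9, 4)

def iou(box, gts):
    """IoU between one box and an array of ground-truth boxes."""
    x1 = np.maximum(box[0], gts[:, 0]); y1 = np.maximum(box[1], gts[:, 1])
    x2 = np.minimum(box[2], gts[:, 2]); y2 = np.minimum(box[3], gts[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2]-box[0])*(box[3]-box[1]) + (gts[:,2]-gts[:,0])*(gts[:,3]-gts[:,1]) - inter
    return inter / area

def label_anchors(anchors, gts):
    """1 = positive, 0 = negative, -1 = discarded, following the rules above."""
    ious = np.stack([iou(a, gts) for a in anchors])   # (num_anchors, num_gts)
    labels = -np.ones(len(anchors), dtype=int)
    labels[ious.max(axis=1) < 0.3] = 0                # negative: IoU < 0.3 with all GTs
    labels[ious.max(axis=1) > 0.7] = 1                # positive rule 2: IoU > 0.7 with some GT
    labels[ious.argmax(axis=0)] = 1                   # positive rule 1: best anchor per GT
    return labels
```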
With these definitions, the objective function is minimized. The loss function for one image is defined as:

L({p_i},{t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)   (1)

where i is the index of a candidate region and p_i is the predicted probability that candidate region i is a target. The ground-truth label p_i* is 1 if the candidate region is labeled positive and 0 if it is labeled negative. t_i is a vector representing the 4 parameterized coordinates of the predicted bounding box, and t_i* is the coordinate vector of the corresponding GT bounding box. N_cls and N_reg are the normalization coefficients of the classification loss and the position regression loss respectively, and λ is the weight parameter between them. The classification loss L_cls is the log loss over the two classes, target and non-target:

L_cls(p_i, p_i*) = -[p_i* log p_i + (1 - p_i*) log(1 - p_i)]   (2)
The position regression loss L_reg is defined by the following function:

L_reg(t_i, t_i*) = R(t_i - t_i*)   (3)

where R is the robust smooth L1 loss function.
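A minimal sketch of the losses in formulas (1) to (3), assuming the standard smooth-L1 form of R and conventional normalizers; all names are illustrative:

```python
import numpy as np

def smooth_l1(x):
    """R in formula (3): 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise, element-wise."""
    ax = np.abs(x)
    return np.where(ax < 1, 0.5 * x**2, ax - 0.5)

def rpn_loss(p, p_star, t, t_star, lam=10.0):
    """Formula (1): classification log loss plus positives-only box regression.

    p: (N,) predicted object probabilities; p_star: (N,) binary labels;
    t, t_star: (N, 4) predicted / ground-truth parameterized box coordinates.
    """
    eps = 1e-12
    l_cls = -(p_star * np.log(p + eps) + (1 - p_star) * np.log(1 - p + eps))
    l_reg = smooth_l1(t - t_star).sum(axis=1)
    n_cls = len(p)                        # conventional choice of normalizer
    n_reg = max(p_star.sum(), 1)          # regression averaged over positives
    return l_cls.sum() / n_cls + lam * (p_star * l_reg).sum() / n_reg
```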
However, training a multi-task deep learning network is not straightforward, because information at different task levels has its own learning difficulties and convergence speed; designing a good multi-task objective function is therefore crucial. The multi-task joint training process is as follows. Suppose the total number of tasks is T, and denote the training data of the t-th task by {(x_i^t, y_i^t)}, where t ∈ (1,T), i ∈ (1,N), N is the total number of training samples, and x_i^t and y_i^t are the feature vector and annotated label of the i-th sample. The multi-task objective function is then expressed as:

min_{w_t} Σ_{t=1}^{T} Σ_{i=1}^{N} L(y_i^t, f(x_i^t; w_t)) + Φ(w_t)   (4)

where f(x_i^t; w_t) is the mapping function of the input feature vector with weight parameters w_t, L(·) is the loss function, and Φ(w_t) is the regularization value of the weight parameters;
For the loss function, softmax together with the log-likelihood cost function is used to train the features of the last layer for image classification. The softmax loss is defined as follows:

p_{i,j} = exp(W_j^T x_i + b_j) / Σ_{k=1}^{n} exp(W_k^T x_i + b_k)   (5)

L_s = -(1/m) Σ_{i=1}^{m} log p_{i,y_i}   (6)

where x_i is the deep feature of the i-th sample, W_j is the j-th column of the weights of the last fully connected layer, b is the bias term, and m and n are the number of processed samples and the number of classes respectively;
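As a hedged illustration of formulas (5) and (6), the softmax log-likelihood loss can be computed as follows (numerically stabilized; names are illustrative):

```python
import numpy as np

def softmax_loss(X, y, W, b):
    """Formulas (5)-(6): mean negative log-likelihood of the true class.

    X: (m, d) deep features; y: (m,) integer class labels in [0, n);
    W: (d, n) last fully connected layer weights; b: (n,) biases.
    """
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)       # stabilize the exponentials
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()
```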
Convolutional neural network training is a back-propagation process similar to the BP algorithm: the error function is back-propagated, and stochastic gradient descent is used to optimize and adjust the convolution parameters and biases until the network converges or the maximum number of iterations is reached;
Back-propagation compares the network output against labeled training samples using the squared-error cost function. For multi-class recognition with c classes and N training samples, the final output error of the network is computed by formula (7):

E^N = (1/2) Σ_{n=1}^{N} Σ_{k=1}^{c} (t_k^n - y_k^n)²   (7)

where E^N is the squared-error cost, t_k^n is the k-th dimension of the label of the n-th sample, and y_k^n is the k-th network output for the n-th sample;
When back-propagating the error function, a computation similar to the traditional BP algorithm is used, in the specific form of formula (8),
δ^l = (W^{l+1})^T δ^{l+1} × f'(u^l),  where u^l = W^l x^{l-1} + b^l   (8)
where δ^l denotes the error function of the current layer, δ^{l+1} the error function of the layer above, W^{l+1} the mapping matrix of the layer above, f' the derivative of the activation function (for pooling layers this corresponds to upsampling the sensitivities), u^l the output of the layer before the activation function is applied, x^{l-1} the input to the layer, and W^l the mapping weight matrix of the current layer.
Still further, multi-task learning exploits the relatedness between tasks, i.e., tasks share information; when several tasks are trained simultaneously, the network uses the information shared between tasks to strengthen the inductive bias of the system and the generalization ability of the classifiers. The multi-task network is divided into five sub-tasks by adding five fully connected layers after the region-of-interest module; each fully connected layer is followed by a softmax activation that normalizes its output to [0,1], and the normalized values are then fed into a piecewise threshold function to promote binary code output. The segmented learning and encoding strategy reduces the redundancy among hash codes and thereby strengthens the robustness of the learned features;
The multi-task learning network is divided into T tasks, each containing c_t classes, and the one-dimensional vector output of each task's fully connected layer is denoted m_t. First the softmax activation normalizes the fully connected output to [0,1], in the following form:

P_t^{(j)} = exp((θ m_t)_j) / Σ_k exp((θ m_t)_k)   (9)

where θ denotes a random hyperplane. The normalized values are then fed into the threshold piecewise function for binarization, taking the midpoint 0.5 of the normalized range as the threshold, which yields the binary output of the fully connected layer:

H_t^{(j)} = 1 if P_t^{(j)} ≥ 0.5, and 0 otherwise   (10)
Finally, to obtain the vehicle attribute hash code learned segment-wise in parallel by the multi-task convolutional network, the H_t vectors obtained from formula (10) are fused again in a given proportion, represented by the vector f_A:

f_A = [α_1 H_1; α_2 H_2; …; α_t H_t]   (11)
The penalty factor α_t in formula (11) takes the form given in formula (12). Multiplying each H_t by α_t compensates for the errors among tasks caused by their different numbers of classes.
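A sketch of the segmented encoding of formulas (9) to (11); since the exact form of α_t in formula (12) is not reproduced here, the code assumes a class-count-proportional weight as a stand-in, and the 0.5 threshold follows the binarization above:

```python
import numpy as np

def attribute_hash(task_outputs, class_counts):
    """Build f_A from the per-task fully connected outputs, cf. formula (11).

    task_outputs: list of T 1-D vectors m_t (fully connected outputs);
    class_counts: list of T class counts c_t.
    """
    f_A = []
    for m_t, c_t in zip(task_outputs, class_counts):
        e = np.exp(m_t - m_t.max())
        p_t = e / e.sum()                   # softmax normalization into [0,1], cf. (9)
        h_t = (p_t >= 0.5).astype(float)    # threshold binarization, cf. (10)
        alpha_t = c_t / sum(class_counts)   # ASSUMED stand-in for the factor in (12)
        f_A.append(alpha_t * h_t)
    return np.concatenate(f_A)              # f_A = [a1*H1; ...; aT*HT], formula (11)
```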
Furthermore, the era of hand-engineered features made heavy use of featurized image pyramids, to the point that object detectors such as DPM required dense scale sampling to obtain good results (e.g., 10 scales per octave). For recognition tasks, engineered features have largely been replaced by features computed by deep convolutional networks. Besides representing higher-level semantics, deep convolutional networks are more robust to scale variation, which facilitates recognition from features computed at a single input scale; yet even with this robustness, pyramids are still needed for the most accurate results. All recent top entries in the ImageNet and COCO detection challenges use multi-scale testing on featurized image pyramids. The main advantage of featurizing each level of an image pyramid is that it produces a multi-level feature representation in which all levels are semantically strong, including the high-resolution levels;
The pyramid shape of the convolutional feature hierarchy is exploited to create a feature pyramid with strong semantics at all scales. To achieve this goal, a structure is used that combines low-resolution, semantically strong features with high-resolution, semantically weak features via a top-down pathway and lateral connections; it can be built quickly from a single input image scale and can replace a featurized image pyramid without sacrificing representational power, speed, or memory. To obtain the instance features of vehicle images while accepting convolutional feature maps of arbitrary size, the last layer of each shared-module unit conv2_x to conv5_x is selected and combined with the output of the region-of-interest module, and a pyramid pooling layer and a vector compression layer are added to compress the three-dimensional features into a one-dimensional feature vector. This choice both enriches the feature map information available to the feature pyramid and uses the deepest layer of each stage, which has the strongest feature representation capability;
The last layer of each module serves as input to the feature pyramid; for the network defined above, the last layers of conv2_x through conv5_x provide input feature maps of sizes {4², 8², 16², 16²} in turn. Let I denote the input image, with height and width h and w, and let convx_x denote the shared convolution module at stage x. The input image I is activated into a three-dimensional feature tensor T of dimension h′×w′×d, a collection of two-dimensional feature maps of size h′×w′; T contains d such maps, written as the set S = {S_n}, n ∈ (1,d), where S_n is the feature map of the n-th channel. The tensor T is then fed into the feature pyramid and convolved with kernels at multiple scales to obtain a three-dimensional tensor T′ of dimension l×l×d, likewise a set of two-dimensional feature maps S′ = {S′_n}, n ∈ (1,d), where S′_n is the n-th channel's feature map of size l×l, d in total. A sliding window of size k×k with max pooling is then used to perform logistic regression on the feature maps, yielding a set of feature maps of size l/k×l/k; the pooled values of each channel S′_n are fused into a one-dimensional vector, the same operation is applied to all d channels in turn, and the resulting individuality (instance) feature vector f_B has size (1, l/k×d). The final retrieval feature vector f is given by formula (13):
f = [f_A; f_B]   (13).
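A sketch of the pyramid pooling, channel fusion, and fusion of formula (13), following the shapes defined above (T′ of size l×l×d, window k×k, l assumed divisible by k); implementation details are assumptions:

```python
import numpy as np

def instance_feature(T_prime, k):
    """Max-pool each l*l channel map with a k*k window, then fuse per channel to f_B."""
    l, _, d = T_prime.shape                # T_prime has shape (l, l, d), l % k == 0
    pooled = T_prime.reshape(l // k, k, l // k, k, d).max(axis=(1, 3))  # (l/k, l/k, d)
    return pooled.max(axis=0).reshape(-1)  # fuse rows per channel: size l/k * d

def retrieval_feature(f_A, f_B):
    """Formula (13): concatenate the attribute hash code and the instance feature."""
    return np.concatenate([f_A, f_B])
```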
The basic idea of the locality-sensitive hashing algorithm is that after two neighboring data points in the original data space are transformed by the same mapping or projection, the probability that the two points remain neighbors in the new data space is high, while the probability that non-neighboring points are mapped into the same bucket is small. That is, after hashing the original data, we want two points that were originally neighbors to be hashed into the same bucket, with the same bucket number. After hash-mapping every item in the original data set, a hash table is obtained; the original data are scattered into its buckets, each bucket receives some of the data, and the data falling into the same bucket are very likely to be neighbors, although non-neighboring data may also be hashed into the same bucket. Therefore, if hash functions can be found such that, after their hash mapping, neighboring data in the original space fall into the same buckets, nearest-neighbor search in the data set becomes easy: simply hash the query to get its bucket number, take out all the data in the bucket corresponding to that number, and perform linear matching to find the data neighboring the query. In other words, the hash mapping partitions the original data set into multiple subsets; the data within each subset are neighbors and each subset contains few elements, so the problem of finding neighboring elements in a very large set is transformed into the problem of finding neighboring elements in a very small set, which drastically reduces the amount of search computation;
A hash function under which two originally neighboring data points fall into the same bucket after hashing must satisfy the following two conditions:
if d(x,y) ≤ d1, then h(x) = h(y) with probability at least p1;
if d(x,y) ≥ d2, then h(x) = h(y) with probability at most p2;
where d(x,y) denotes the distance between x and y, d1 < d2, and h(x) and h(y) denote the hash transforms of x and y respectively.
A hash function satisfying the above two conditions is called (d1,d2,p1,p2)-sensitive. The process of hashing the original data set with one or more (d1,d2,p1,p2)-sensitive hash functions to generate one or more hash tables is called locality-sensitive hashing.
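A minimal sketch of a locality-sensitive hash family of the random-hyperplane type later given in formula (16); W and b are drawn at random, and K bits are concatenated into one bucket key:

```python
import numpy as np

def make_hash(dim, K, rng=np.random.default_rng(0)):
    """Return g(.) built from K random-hyperplane bits h(x) = sign(Wx + b)."""
    W = rng.standard_normal((K, dim))
    b = rng.standard_normal(K)
    def g(x):
        return tuple((W @ x + b > 0).astype(int))     # bucket key of K bits
    return g
```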
The process of using locality-sensitive hashing to index massive data, i.e., to build hash tables, and to perform approximate nearest-neighbor search through the index is as follows:
Offline indexing
(1) Select hash functions that satisfy (d1,d2,p1,p2)-sensitive locality-sensitive hashing;
(2) Determine the number L of hash tables, the number K of hash functions per table, and the parameters of the LSH hash function itself, according to the required accuracy of the search results, i.e., the probability that neighboring data are found;
(3) Hash all data into the corresponding buckets through the LSH hash functions, forming one or more hash tables;
Online search
(1) Hash the query data through the LSH hash functions to obtain the corresponding bucket numbers;
(2) Take out the data corresponding to those bucket numbers; to guarantee search speed, only the first 2L items are taken;
(3) Compute the similarity or distance between the query and these 2L items, and return the nearest-neighbor data;
The online lookup time of locality-sensitive hashing consists of two parts: (i) the time to compute the hash values, i.e., the bucket numbers, through the LSH hash functions; and (ii) the time to compare the query data with the data in the buckets. The lookup time of LSH is therefore sublinear: by indexing the items within the buckets to accelerate matching, the cost of part (ii) drops from O(N) to O(logN) or O(1), greatly reducing the amount of computation.
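The offline indexing and online lookup steps above can be sketched as follows, reusing make_hash from the previous sketch; the 2L cutoff follows the text, while the distance measure and data layout are illustrative assumptions:

```python
import numpy as np

def build_index(points, dim, L, K):
    """Offline: hash every point into L tables of K-bit buckets."""
    hashes = [make_hash(dim, K, np.random.default_rng(i)) for i in range(L)]
    tables = [{} for _ in range(L)]
    for j, p in enumerate(points):
        for g, table in zip(hashes, tables):
            table.setdefault(g(p), []).append(j)
    return hashes, tables

def query(q, points, hashes, tables, topk):
    """Online: collect at most 2L candidates from the query's buckets, rank by distance."""
    candidates = []
    for g, table in zip(hashes, tables):
        candidates += table.get(g(q), [])
    candidates = candidates[:2 * len(tables)]          # keep only the first 2L items
    dists = [(np.linalg.norm(points[j] - q), j) for j in set(candidates)]
    return [j for _, j in sorted(dists)[:topk]]
```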
A key property of the locality-sensitive hashing is that similar samples are mapped to the same bucket with high probability; the LSH hash function h(·) satisfies the following condition:
Pr{h(f_Aq) = h(f_A)} = sim(f_Aq, f_A)   (14)
where sim(f_Aq, f_A) denotes the similarity between f_Aq and f_A, h(f_A) denotes the hash function of f_A, and h(f_Aq) denotes the hash function of f_Aq; the similarity measure is directly associated with a distance function σ, e.g.:

sim(f_Aq, f_A) = 1 - σ(f_Aq, f_A)   (15)
A typical family of locality-sensitive hash functions is given by random projection and thresholding, as shown in formula (16),
h(f_A) = sign(W f_A + b)   (16)
where W is a random hyperplane vector and b is a random intercept.
The locality-sensitive hashing consists of a preprocessing algorithm and a nearest-neighbor search algorithm; through these two algorithms, the searched image features are represented as a string of fixed-length binary codes;
The preprocessing algorithm proceeds as follows:
Input a set of extracted image features p and the number l_1 of hash tables; map the image features with random hash functions g(·), storing each point p_j into the bucket g_i(p_j) of hash table T_i; output the hash tables T_i, i = 1,…,l_1;
The nearest-neighbor search algorithm proceeds as follows:
Input a query image feature q, access the hash tables T_i, i = 1,…,l_1, generated by the preprocessing algorithm, together with the number K of nearest neighbors; return the K nearest-neighbor data of the query point q in the data set S;
Let Γ = {I_1, I_2, …, I_n} be the data set of n images to be searched, and let the binary code corresponding to each image be Γ_H = {H_1, H_2, …, H_n}, H_i ∈ {0,1}^h. Given a query image I_q with binary code H_q, the images whose Hamming distance between H_q and H_i ∈ Γ_H is below the threshold T_H are placed into a candidate pool P as candidate images.
A locality-sensitive re-ranking algorithm is constructed from the instance features. The traditional locality-sensitive hashing algorithm mainly returns images that are close in distance, i.e., the similarity between the query image and the images in the candidate pool is close to 1; this is mainly because mapping through the low-dimensional vehicle attribute hash code retrieves vehicles of the same model. Vehicles of the same model, however, may still be difficult to tell apart: they differ clearly under human subjective judgment, yet the vehicle attribute hash code alone cannot effectively capture these differences. To single out the vehicles in the candidate pool that share the same individual characteristics as the query picture, after the query image has been mapped into the buckets by its vehicle attribute hash code, the extracted image instance features are used to rank the pictures in the bucket again, reducing the intra-class error. The re-ranking formula is:

s_k = y · γ · cos(f_Bq, f_Bk)   (17)
In formula (17), k denotes the k-th image in the bucket selected by the vehicle attribute hash mapping, γ denotes the penalty factor, and cos denotes the cosine distance formula measuring the image instance features. To exclude erroneous mappings of the vehicle attribute hash code, y indicates whether the type of the query image f_Aq before mapping equals that of the image in the bucket: y is 1 if they are equal and 0 otherwise;
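A sketch of the bucket re-ranking of formula (17); the penalty factor γ is assumed constant here, and the type-match indicator y follows the definition above:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rerank_bucket(f_q, type_q, bucket, gamma=1.0):
    """Score each bucket image by formula (17) and sort in descending order.

    bucket: list of (f_Bk, type_k) pairs; gamma: penalty factor (assumed 1 here).
    """
    scored = []
    for k, (f_k, type_k) in enumerate(bucket):
        y = 1.0 if type_k == type_q else 0.0          # exclude wrongly mapped images
        scored.append((y * gamma * cosine(f_q, f_k), k))
    return sorted(scored, reverse=True)
```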
In the further ranking stage, the images whose Hamming distance between H_q and H_i ∈ Γ_H is below the threshold T_H have already been placed into the candidate pool P; to obtain more precise search results, the present invention further applies a re-ranking method on top of the candidate pool;
In the re-ranking method, given a query image I_q and the candidate pool P, the instance features are used to determine the top k ranked images from the images in P; their degree of similarity is computed with formula (17),
Further, the re-ranking is evaluated with a ranking-based criterion. Given a query image I_q and a similarity measure, each data set image receives a rank; the retrieval precision of a query image I_q is expressed by evaluating the top k ranked images, as in formula (18):

Precision@k = (1/k) Σ_{i=1}^{k} Rel(i)   (18)
where Rel(i) denotes the ground-truth relevance between the query image I_q and the i-th ranked image, k is the number of ranked images, and Precision@k is the search precision. When computing ground-truth relevance, only the labeled portion is considered, Rel(i) ∈ {0,1}: Rel(i) = 1 if the query image and the i-th ranked image carry the same label, and Rel(i) = 0 otherwise; traversing the top k ranked images in the candidate pool P yields the search precision;
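Formula (18) reduces to a short computation over the top-k labels, sketched here:

```python
def precision_at_k(query_label, ranked_labels, k):
    """Formula (18): fraction of the top-k ranked images sharing the query's label."""
    rel = [1 if lab == query_label else 0 for lab in ranked_labels[:k]]
    return sum(rel) / k
```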
In step 5), when no query image can be obtained, text retrieval is used as an auxiliary retrieval mode; without any additional training, the retrieval features obtained from text and the features obtained from the convolutional network share one retrieval pipeline. The text feature extraction method is as follows:
Initialization: parse the text file into a term vector; remove trivial and duplicate terms; check the terms to ensure the parsing is correct;
5.1) Take a randomly combined minimal word-segmentation vector R = (r_1, r_2, …, r_n) from the input text O;
5.2) Integrate R with the ordering of f_A and the vehicle attribute hash code to obtain the text attribute feature f_ATxt, whose dimension at this point is smaller than that of R (a sketch of one possible integration follows step 5.4);
5.3) Retrieve with the locality-sensitive re-ranking hashing algorithm;
5.4) Return the group I of similar images.
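Steps 5.1) to 5.4) leave the mapping from text terms to attribute hash segments unspecified; the sketch below assumes a simple lookup table from attribute words to the corresponding task codes H_t, which is an illustrative assumption rather than the patent's method:

```python
def text_attribute_feature(tokens, vocab_to_code):
    """Assemble f_ATxt by concatenating the hash segments of recognized attribute words.

    tokens: minimal segmented word vector R; vocab_to_code: ASSUMED lookup table
    mapping an attribute word (e.g., a color or model name) to its task code H_t.
    """
    segments = [vocab_to_code[t] for t in tokens if t in vocab_to_code]
    return [bit for seg in segments for bit in seg]    # flattened f_ATxt
```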
To realize the above, several core problems must be solved: 1) for the difficulty of image feature extraction, use the powerful feature representation capability of the deep auto-encoding convolutional neural network for adaptive feature extraction; 2) for the slow speed of large-scale image retrieval, design a multi-task hierarchical method that quickly compares the query image against the images in the database; 3) design a method for extracting instance features from vehicle images for retrieval among similar vehicle types; 4) design a modified locality-sensitive hash re-ranking code to enlarge the differences between vehicle images within a class; 5) exploit the advantages of end-to-end deep networks to design an end-to-end deep auto-encoding convolutional neural network that fuses detection, recognition, and feature extraction into a single network.
The modified locality-sensitive hash re-ranking vehicle retrieval method based on multi-task deep learning of the present invention comprises the following process: 1) feed the image into the deep auto-encoding convolutional neural network, perform logistic regression on the feature maps, and segment and predict the position and category of the regions of interest in the query image; 2) use the multi-task deep auto-encoding convolutional neural network to extract the vehicle attribute hash code learned segment-wise in parallel; 3) use the pyramid shape of the convolutional feature hierarchy to extract the instance features of each vehicle; 4) retrieve the extracted features with the modified locality-sensitive hashing method; 5) use cross-modal retrieval when no vehicle image is available;
The beneficial effects of the present invention are mainly:
1) a multi-task end-to-end convolutional neural network is provided that recognizes vehicle model, series, logo, color, and license plate;
2) the powerful feature representation capability of deep convolutional neural networks is used for adaptive feature extraction;
3) a modified locality-sensitive hash re-ranking code is constructed for efficient retrieval of the features extracted by the convolutional network;
4) the design balances generality and specificity: in terms of generality, the retrieval speed, accuracy, and practicality meet the needs of various users; in terms of specificity, users can build a dedicated data set for their particular needs and fine-tune the network parameters to obtain an application-specific vehicle retrieval system.
Brief Description of the Drawings
Figure 1 is the overall retrieval flowchart.
Figure 2 is the flowchart of the overall training network.
Figure 3 is an expanded view of the RPN network.
Figure 4 illustrates vehicles that the vehicle attribute hash code cannot distinguish.
Figure 5 shows the generation of the text feature vector.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings of the embodiments.
Referring to Figures 1 to 5, the overall flow of the modified locality-sensitive hashing vehicle retrieval method based on multi-task deep learning is shown in Figure 1. First, the pictures in the database are fed into the multi-task end-to-end convolutional neural network for deep learning, training, and recognition; extensive training and the layer-by-layer progressive network structure learn the vehicle's various attributes in depth, including model, series, logo, color, and license plate. This convolutional network is then used to extract the vehicle attribute hash codes learned segment-wise in parallel from the vehicle images, and the constructed feature pyramid module extracts the vehicle's instance features; the query vehicle image is compared with the images in the database using the modified locality-sensitive hash re-ranking method.
The multi-task end-to-end convolutional neural network for deep learning and recognition contains a shared convolution module, a region-of-interest coordinate regression and recognition module, a multi-task learning module, and an instance feature extraction module; the overall flowchart is shown in Figure 2, containing four shared convolution modules and a four-level feature pyramid module in total. The double-dash-dot lines in Figure 2 mark the vehicle instance features extracted by the compression layer; the dashed box in Figure 2 marks the proposed partitioning and encoding module, which learns compact vehicle features through the different tasks; finally, the two extracted feature vectors are fused;
The present invention comprises the following steps:
1) Shared convolution module: the shared network consists of five convolution modules; the last layers of conv2_x through conv5_x output feature maps of sizes {4², 8², 16², 16²} respectively, while conv1, the input module, contains only a single convolutional layer;
A region-of-interest coordinate regression and recognition module is connected after the shared convolution module. This module takes an image of arbitrary size as input and outputs a set of rectangular prediction boxes for the target regions, including the position coordinates of each prediction box and probability scores over the dataset categories. To generate region proposals, the input image first passes through the shared convolutional layers to produce a feature map, on which a multi-scale convolution operation is then performed. Specifically: at each sliding-window position, 3 scales and 3 aspect ratios are used, centered on the center of the current sliding window, so that 9 candidate regions of different scales can be mapped back onto the original image. For a shared convolutional feature map of size w×h, there are w×h×9 candidate regions in total. Finally, the classification layer outputs scores for the w×h×9×2 outcomes, i.e., the estimated probability that each region is target or non-target, and the regression layer outputs w×h×9×4 parameters, i.e., the coordinate parameters of the candidate regions, as shown in Figure 3;
When training the RPN, each candidate region is assigned a binary label to mark whether the region is an object. The specific operation is as follows: 1) a positive label is assigned to the candidate region having the highest IoU (Intersection-over-Union) overlap with a ground-truth (GT) region, and 2) to any candidate region whose IoU overlap with any GT bounding box exceeds 0.7; a negative label is assigned to candidate regions whose IoU with all GT bounding boxes is below 0.3; 3) regions in between are discarded.
With these definitions, the objective function is minimized. The loss function for one image is defined as:

L({p_i},{t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)   (1)

where i is the index of a candidate region and p_i is the predicted probability that candidate region i is a target. The ground-truth label p_i* is 1 if the candidate region is labeled positive and 0 if it is labeled negative. t_i is a vector representing the 4 parameterized coordinates of the predicted bounding box, and t_i* is the coordinate vector of the corresponding GT bounding box. N_cls and N_reg are the normalization coefficients of the classification loss and the position regression loss respectively, and λ is the weight parameter between them. The classification loss L_cls is the log loss over the two classes (target vs. non-target):

L_cls(p_i, p_i*) = -[p_i* log p_i + (1 - p_i*) log(1 - p_i)]   (2)
The position regression loss L_reg is defined by the following function:

L_reg(t_i, t_i*) = R(t_i - t_i*)   (3)

where R is the robust smooth L1 loss function.
However, training a multi-task deep learning network is not straightforward, because information at different task levels has its own learning difficulties and convergence speed. Designing a good multi-task objective function is therefore crucial. The multi-task joint training process is as follows. Suppose the total number of tasks is T, and denote the training data of the t-th task by {(x_i^t, y_i^t)}, where t ∈ (1,T), i ∈ (1,N), and N is the total number of training samples; x_i^t and y_i^t are the feature vector and annotated label of the i-th sample. The multi-task objective function can then be expressed as:

min_{w_t} Σ_{t=1}^{T} Σ_{i=1}^{N} L(y_i^t, f(x_i^t; w_t)) + Φ(w_t)   (4)

where f(x_i^t; w_t) is the mapping function of the input feature vector with weight parameters w_t, L(·) is the loss function, and Φ(w_t) is the regularization value of the weight parameters.
For the loss function, softmax together with the log-likelihood cost function is used to train the features of the last layer and realize image classification. The softmax loss function is defined as follows:

L_s = −(1/m) Σ_{i=1}^{m} log( e^{W_{y_i}^T x_i + b_{y_i}} / Σ_{j=1}^{n} e^{W_j^T x_i + b_j} )

where x_i is the i-th deep feature with class label y_i, W_j is the j-th column of the weights in the last fully connected layer, b is the bias term, and m and n are the number of processed samples and the number of classes, respectively;
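A numpy sketch of this softmax log-likelihood loss, assuming the reconstructed form above:

```python
import numpy as np

def softmax_loss(X, y, W, b):
    """Mean negative log-likelihood of the correct class.

    X: (m, d) deep features; y: (m,) integer labels;
    W: (d, n) last-layer weights; b: (n,) bias.
    """
    logits = X @ W + b                               # (m, n)
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()
```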
Convolutional neural network training is a back-propagation process similar to the BP algorithm: the error function is propagated backwards, and the convolution parameters and biases are optimized and adjusted by stochastic gradient descent until the network converges or the maximum number of iterations is reached;
Back-propagation compares the network outputs against the labeled training samples using a squared-error cost function. For multi-class recognition with c classes and N training samples, the final output error of the network is computed by formula (7):

E_N = (1/2) Σ_{n=1}^{N} Σ_{k=1}^{c} (t_k^n − y_k^n)^2   (7)

where E_N is the squared-error cost function, t_k^n is the k-th dimension of the label of the n-th sample, and y_k^n is the k-th network output predicted for the n-th sample;
When back-propagating the error function, a computation similar to the traditional BP algorithm is used, in the specific form shown in formula (8):
δ^l = (W^{l+1})^T δ^{l+1} ⊙ f′(u^l),  where u^l = W^l x^{l−1} + b^l   (8)
where δ^l is the error term of the current layer, δ^{l+1} is the error term of the following layer (the layer nearer the output), W^{l+1} is that layer's mapping matrix, f′ denotes the derivative of the activation function, ⊙ denotes the elementwise product, u^l is the output of layer l before the activation function is applied, x^{l−1} is the input coming from the previous layer, and W^l is the mapping weight matrix of the current layer;
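A numpy sketch of formula (8) for one fully connected layer; the sigmoid activation is an assumption made purely for illustration.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def backprop_delta(delta_next, W_next, u_l):
    """delta^l = (W^{l+1})^T delta^{l+1}, multiplied elementwise by f'(u^l)."""
    s = sigmoid(u_l)
    return (W_next.T @ delta_next) * s * (1.0 - s)   # f'(u) = s(u) * (1 - s(u))
```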
2) Multi-task learning exploits the correlation between tasks, i.e. information is shared across tasks; when several tasks are trained simultaneously, the network uses this shared information to strengthen the system's inductive bias and the classifier's generalization ability. The multi-task network is divided into five subtasks by adding five fully connected layers after the region-of-interest module; each fully connected layer is followed by a softmax activation that normalizes its output to [0,1], and the normalized values are then fed into a piecewise (threshold) function to promote binary-code outputs. This piecewise learning and coding strategy reduces the redundancy between hash codes and thereby strengthens the robustness of the learned features;
The multi-task learning network is divided into T tasks, each containing c_t categories; the one-dimensional fully-connected output vector of each task is denoted m_t. First, the softmax activation function normalizes the fully connected output to [0,1], in the following specific form:

where θ denotes a random hyperplane. The normalized values are then fed into the threshold piecewise function for binarization, giving the binary output of the fully connected layer, in the following specific form:

Finally, to obtain the vehicle-attribute hash codes learned segment-by-segment in parallel by the multi-task convolutional network, the vectors H_t obtained from formula (10) are fused once more in fixed proportions into a vector f_A, in the following specific form:
f_A = [α_1 H_1; α_2 H_2; …; α_t H_t]   (11)
The penalty factor α_t in formula (11) takes the following specific form:

Multiplying each H_t by the penalty factor α_t compensates for the error caused by the different numbers of classes across the tasks;
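A sketch of this normalize–binarize–fuse pipeline follows. Since formulas (9), (10) and (12) are not reproduced in the text above, the 0.5 binarization threshold and the choice of α_t proportional to each task's class count are assumptions used purely for illustration.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def fuse_hash_codes(fc_outputs, class_counts):
    """fc_outputs: per-task FC vectors m_t; class_counts: c_t for each task."""
    total = float(sum(class_counts))
    segments = []
    for m_t, c_t in zip(fc_outputs, class_counts):
        normalized = softmax(np.asarray(m_t))     # values in [0, 1]
        H_t = (normalized >= 0.5).astype(int)     # assumed threshold function
        alpha_t = c_t / total                     # assumed penalty factor
        segments.append(alpha_t * H_t)
    return np.concatenate(segments)               # f_A = [a1*H1; ...; aT*HT]
```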
3) The pyramidal shape of the convolutional feature hierarchy is exploited to build a feature pyramid that carries strong semantics at all scales. To achieve this, a structure is used that combines low-resolution, semantically strong features with high-resolution, semantically weak features through a top-down pathway and lateral connections; it can be built quickly from a single input image scale and can replace a featurized image pyramid without sacrificing representational power, speed, or memory. To obtain instance features of the vehicle image while accommodating convolutional feature maps of arbitrary size, the last layer of each of the shared modules conv2_x to conv5_x is selected and combined with the output of the region-of-interest module, and a pyramid pooling layer and a vector compression layer are then added to compress the three-dimensional features into a one-dimensional feature vector. This choice both enriches the feature-map information available to the feature pyramid and exploits the fact that the deepest layer of each stage has the strongest feature representation;
The last layer of each module is used as input to the feature pyramid; for the last layers of the networks conv2_x to conv5_x defined above, {4^2, 8^2, 16^2, 16^2} are selected in turn as the input feature-map sizes of the feature pyramid. Let I denote the input image, with height and width written h and w, and let convx_x denote the x-th shared convolution module. The input image I is activated into a three-dimensional feature tensor T of dimension h′×w′×d, i.e. a collection of d two-dimensional feature maps of size h′×w′, written as the set S = {S_n}, n ∈ (1, d), where S_n is the feature map of the n-th channel. The tensor T is then fed into the feature pyramid and convolved with kernels at multiple scales to obtain a tensor T′ of dimension l×l×d, likewise a collection of two-dimensional feature maps S′ = {S′_n}, n ∈ (1, d), where S′_n is the n-th channel feature map; each feature map has size l×l and there are d of them in total. Max pooling over sliding windows of size k×k is then applied to the feature maps, yielding a set of feature maps of size l/k×l/k; each channel's S′_n is fused into a one-dimensional vector, the same operation is applied to all d channels in turn, and the resulting instance feature vector f_B has size (1, l/k×d). The final retrieval feature vector f is given by formula (13):
f = [f_A; f_B]   (13)
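A minimal sketch of the pooling and concatenation just described; the step that fuses each pooled channel map into a one-dimensional slice is underspecified above, so taking the per-row maximum is an assumption.

```python
import numpy as np

def instance_feature(T_prime, k):
    """T_prime: (l, l, d) pyramid output; returns f_B of length (l // k) * d.

    Assumes l is divisible by k.
    """
    l, _, d = T_prime.shape
    pooled = T_prime.reshape(l // k, k, l // k, k, d).max(axis=(1, 3))
    per_channel = pooled.max(axis=1)      # assumed fusion to shape (l/k, d)
    return per_channel.reshape(-1)        # flatten to (l/k * d,)

def retrieval_feature(f_A, f_B):
    """f = [f_A; f_B] as in formula (13)."""
    return np.concatenate([f_A, f_B])
```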
The basic idea of the locality-sensitive hashing algorithm is this: after two neighboring data points in the original data space are transformed by the same mapping or projection, the probability that they remain neighbors in the new data space is high, while the probability that non-neighboring points are mapped into the same bucket is small. In other words, if we apply a suitable hash mapping to the original data, we expect two originally adjacent items to be hashed into the same bucket, i.e. to receive the same bucket number. After hashing every item in the original data set, we obtain a hash table: the original data are scattered across its buckets, each bucket receives some of the data, and items in the same bucket are very likely to be neighbors, although non-neighboring items may occasionally be hashed into the same bucket as well. Therefore, if hash functions can be found such that, after their hash mapping, points adjacent in the original space fall into the same bucket, nearest-neighbor search over the data set becomes easy: we simply hash the query to obtain its bucket number, retrieve all data in the corresponding bucket, and perform linear matching to find the data adjacent to the query. Put differently, the hash mapping partitions the original data set into many subsets whose members are mutually adjacent and whose sizes are small, which turns the problem of finding neighbors in a very large set into that of finding neighbors in a very small one, drastically reducing the search computation;
A hash function under which two originally adjacent data points fall into the same bucket after hashing must satisfy the following two conditions:

if d(x,y) ≤ d1, then the probability that h(x) = h(y) is at least p1;

if d(x,y) ≥ d2, then the probability that h(x) = h(y) is at most p2;

where d(x,y) denotes the distance between x and y, d1 < d2, and h(x) and h(y) denote the hash transformations of x and y, respectively.

A hash function satisfying these two conditions is called (d1,d2,p1,p2)-sensitive, and the process of hashing the original data set with one or more (d1,d2,p1,p2)-sensitive hash functions to generate one or more hash tables is called locality-sensitive hashing.
The process of using locality-sensitive hashing to index massive data, i.e. to build hash tables, and then perform approximate nearest-neighbor search through the index is as follows:

Offline index construction

(1) select hash functions satisfying the (d1,d2,p1,p2)-sensitivity condition of locality-sensitive hashing;

(2) determine the number L of hash tables, the number K of hash functions in each table, and the parameters of the locality-sensitive hash functions themselves, according to the required accuracy of the search results, i.e. the probability that adjacent data are found;

(3) hash all data into the corresponding buckets through the locality-sensitive hash functions, thereby forming one or more hash tables;

Online lookup

(1) hash the query data with the locality-sensitive hash functions to obtain the corresponding bucket numbers;

(2) retrieve the data stored under those bucket numbers; to guarantee lookup speed, only the first 2L items are taken;

(3) compute the similarity or distance between the query data and these 2L items, and return the nearest-neighbor data;
The online lookup time of locality-sensitive hashing consists of two parts: (i) the time to compute the hash values, i.e. the bucket numbers, with the locality-sensitive hash functions; and (ii) the time to compare the query data against the data inside the buckets. The lookup time of locality-sensitive hashing is therefore sublinear; by additionally indexing the contents of each bucket to speed up matching, the cost of part (ii) drops from O(N) to O(logN) or O(1), which greatly reduces the computation;
A key property of the locality-sensitive hashing described here is that similar samples are mapped to the same bucket with high probability; the locality-sensitive hash function h(·) satisfies the following condition:
Pr{h(f_Aq) = h(f_A)} = sim(f_Aq, f_A)   (14)
where sim(f_Aq, f_A) denotes the similarity between f_Aq and f_A, h(f_A) denotes the hash function applied to f_A, and h(f_Aq) denotes the hash function applied to f_Aq; the similarity measure is directly associated with a distance function σ, e.g.:

A typical family of locality-sensitive hash functions is given by random projection and thresholding, as shown in formula (16):
h(f_A) = sign(W f_A + b)   (16)
where W is a random hyperplane vector and b is a random intercept.
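A sketch of the random-projection hash of formula (16); drawing W and b from a standard normal distribution is a common choice and an assumption here.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_hash(dim):
    """One hash bit of the form h(f) = sign(W.f + b)."""
    W = rng.standard_normal(dim)   # random hyperplane (assumed Gaussian)
    b = rng.standard_normal()      # random intercept
    return lambda f: 1 if np.dot(W, f) + b >= 0 else 0

h = make_hash(48)
print(h(rng.standard_normal(48)))  # 0 or 1
```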
The locality-sensitive hashing scheme consists of a preprocessing algorithm and a nearest-neighbor search algorithm; through these two algorithms the search-image features are represented as a string of fixed-length binary codes;

The preprocessing algorithm proceeds as follows:

Input: a set of extracted image features p and the number l_1 of hash tables. Each image feature is mapped with a random hash function g(·), and each point p_j is stored in the bucket numbered g_i(p_j) of hash table T_i. Output: the hash tables T_i, i = 1, …, l_1;

The nearest-neighbor search algorithm proceeds as follows:

Input: a query image feature q, the hash tables T_i, i = 1, …, l_1, generated by the preprocessing algorithm, and the number K of nearest neighbors to return. The algorithm returns the K nearest-neighbor data of the query point q in the data set S;
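A compact sketch of these two algorithms built on the sign hash above; concatenating K sign bits into a single bucket key per table is an assumption about the form of g(·).

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

def build_tables(features, l1=4, K=8):
    """Preprocessing: hash every feature into l1 tables of K-bit buckets."""
    dim = features.shape[1]
    projections = [rng.standard_normal((K, dim)) for _ in range(l1)]
    tables = [defaultdict(list) for _ in range(l1)]
    for j, p in enumerate(features):
        for W, table in zip(projections, tables):
            key = tuple((W @ p >= 0).astype(int))   # g_i(p_j): K sign bits
            table[key].append(j)
    return projections, tables

def search(q, features, projections, tables, K_nn=5):
    """Search: gather bucket members from every table, rank by distance."""
    candidates = set()
    for W, table in zip(projections, tables):
        candidates.update(table.get(tuple((W @ q >= 0).astype(int)), []))
    ranked = sorted(candidates, key=lambda j: np.linalg.norm(features[j] - q))
    return ranked[:K_nn]
```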
Let Γ = {I_1, I_2, …, I_n} be the searched data set of n images, with corresponding binary codes Γ_H = {H_1, H_2, …, H_n}, H_i ∈ {0,1}^h. Given a search image I_q with binary code H_q, those images whose Hamming distance between H_q and H_i ∈ Γ_H is smaller than a threshold T_H are placed into the candidate pool P as candidate images.
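A sketch of this candidate-pool construction; representing the codes as numpy 0/1 arrays is a convenience assumption.

```python
import numpy as np

def candidate_pool(H_q, codes, T_H):
    """Return indices i with Hamming(H_q, H_i) < T_H."""
    dists = (codes != H_q).sum(axis=1)   # Hamming distance per image
    return np.flatnonzero(dists < T_H)

codes = np.random.default_rng(1).integers(0, 2, size=(1000, 48))
print(candidate_pool(codes[0], codes, T_H=10)[:5])
```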
4) A locality-sensitive re-ranking algorithm is built on the instance features. A traditional locality-sensitive hashing search mainly returns images that are close in distance, i.e. whose similarity to the query image is close to 1. This is because the low-dimensional vehicle-attribute hash-code mapping retrieves vehicles of the same model, yet vehicles of the same model can still be hard to tell apart: differences that are obvious to human subjective judgment cannot be distinguished effectively by the vehicle-attribute hash code alone, as shown in Figure 4. To pick out the vehicles in the candidate pool that share the same individual characteristics as the query image, after the query image has been mapped into the buckets via its vehicle-attribute hash code, the extracted image instance features are used to re-rank the images in the bucket and so reduce the intra-class error. The re-ranking formula takes the following form:
In formula (17), k indexes the k-th image in the bucket selected by the vehicle-attribute hash-code mapping, a penalty factor is applied, and cos denotes the cosine-distance formula measuring the similarity of image instance features. To exclude erroneous mappings of the vehicle-attribute hash code, y indicates whether the type of the query image f_Aq before mapping equals the type of the image in the bucket: y is 1 if they are equal and 0 otherwise;

For the further ranking, the images whose Hamming distance between H_q and H_i ∈ Γ_H is smaller than the threshold T_H have already been placed into the candidate pool P; to obtain more precise search results, the present invention further applies a re-ranking method on top of the candidate pool;

In the re-ranking method, given a search image I_q and the candidate pool P, the instance features are used to determine the top k ranked images among the images in the candidate pool P, and the degree of similarity between them is computed with formula (17);
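A sketch of this instance-feature re-ranking; since formula (17) is not reproduced above, scoring each candidate by y times the cosine similarity of the instance features is an assumption about its form.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rerank(f_B_query, pool, query_type, top_k=10):
    """pool: list of (image_id, f_B instance feature, vehicle type) tuples."""
    scored = []
    for image_id, f_B, vtype in pool:
        y = 1 if vtype == query_type else 0   # reject wrong hash mappings
        scored.append((y * cosine(f_B_query, f_B), image_id))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [image_id for _, image_id in scored[:top_k]]
```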
Further, the re-ranking is evaluated with a ranking-based criterion: given a search image I_q and a similarity measure, every image in the data set receives a rank, and the top k ranked images are used to express the retrieval precision for the search image I_q, as given by formula (18):

Precision@k = (1/k) Σ_{i=1}^{k} Rel(i)   (18)
where Rel(i) denotes the true relevance between the search image I_q and the i-th ranked image, k is the number of ranked images, and Precision@k is the search precision. When computing the true relevance, only the part carrying classification labels is considered, with Rel(i) ∈ {0,1}: Rel(i) = 1 is set if the search image and the i-th ranked image have the same label, and Rel(i) = 0 otherwise; traversing the top k ranked images in the candidate pool P yields the search precision;
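A sketch of this evaluation, assuming the reconstructed Precision@k above:

```python
def precision_at_k(query_label, ranked_labels, k):
    """Fraction of the top-k ranked images sharing the query's label."""
    rel = [1 if label == query_label else 0 for label in ranked_labels[:k]]
    return sum(rel) / k

print(precision_at_k("sedan", ["sedan", "suv", "sedan", "sedan"], k=4))  # 0.75
```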
5) When no query image can be obtained, text retrieval is used as an auxiliary search mode; without any additional training, the retrieval features obtained from text and the features obtained from the convolutional network can share a single retrieval pipeline. If a text contains vehicle-description identifier tags, as shown in Figure 5, its text features are obtained as follows:

The initialization process is: parse the text file into a term vector; remove minor words and duplicate words; check the terms to ensure the correctness of the parse;

5.1) extract a randomly combined minimal word-segmentation vector R = (r_1, r_2, …, r_n) from the input text O;

5.2) integrate R with the ordering of f_A and the vehicle-attribute hash code to obtain the text attribute feature f_ATxt; at this point the dimension of f_ATxt is smaller than the dimension of R;

5.3) search using the locality-sensitive re-ranking hash algorithm;

5.4) return the group I of similar images;

The above is only a description of preferred embodiments of the present invention and is not intended to limit the present invention; any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.