
CN103678610A - Method for recognizing object based on intelligent mobile phone sensor - Google Patents

Method for recognizing object based on intelligent mobile phone sensor Download PDF

Info

Publication number
CN103678610A
CN103678610A (Application CN201310690339.4A)
Authority
CN
China
Prior art keywords
camera
picture
fov
candidate
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310690339.4A
Other languages
Chinese (zh)
Inventor
寿黎但
陈珂
陈刚
胡天磊
彭湃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201310690339.4A priority Critical patent/CN103678610A/en
Publication of CN103678610A publication Critical patent/CN103678610A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract



The invention discloses an object recognition method based on smartphone sensors. The method makes full use of the rich sensor parameters of a smartphone, including GPS positioning, the camera and the camera parameters, and proposes a probabilistic FOV model based on geospatial position together with an associated pruning strategy and a similarity measure based on visual space. By combining these modalities, the method of the present invention can correctly identify the object queried by the user.


Description

Object recognition method based on smartphone sensors
Technical field
The present invention relates to spatial data indexing, image recognition and retrieval, and sparse coding in the field of signal processing, and in particular to an object recognition method based on smartphone sensors.
Background technology
Spatial data are data that represent the position, shape, size and distribution characteristics of spatial entities; at present they are mainly used in the field of Geographic Information Systems (GIS). Using spatial data requires a large number of spatial operations such as queries, insertions and deletions, so an efficient spatial index is essential. The R-tree is currently the most widely used spatial index structure; it extends the B-tree to multi-dimensional space and is a balanced tree structure. The R-tree approximates complex spatial objects by minimum bounding rectangles aligned with the axes of the data space, so a complex object can be represented with only a few bytes. Although much information is lost in this way, the minimum bounding rectangle retains the most important geometric properties of the object, namely its position and its extent along each coordinate axis.
In the field of image recognition and retrieval, the bag-of-visual-words model is one of the most commonly used and most effective methods. Inspired by the bag-of-words model in text retrieval, it clusters the local feature points of a large number of pictures in an offline phase; the cluster centres are the visual words. For a new picture, the detected local features only need to be projected onto these pre-computed visual words to represent the image as a feature vector over the vocabulary: the dimensionality of the vector equals the vocabulary size, and the value in each dimension is the frequency with which that visual word occurs in the image. Once images are vectorised, similarity measures from the vector space model (such as the cosine distance) can be applied directly, and new image categories can be recognised by training a k-nearest-neighbour classifier or a support vector machine on existing samples.
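As a concrete illustration of the bag-of-visual-words encoding just described, the sketch below clusters pre-extracted local descriptors into a vocabulary and turns an image into a word-frequency vector. It is a minimal sketch under stated assumptions: the descriptors are assumed to be already extracted (e.g. 128-dimensional SIFT vectors), and the vocabulary size, function names and use of scikit-learn's KMeans are illustrative choices, not part of the patent.

```python
# Minimal bag-of-visual-words sketch (assumes local descriptors are already extracted).
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors: np.ndarray, n_words: int = 1000) -> KMeans:
    """Offline phase: cluster local descriptors from many pictures;
    the cluster centres are the visual words."""
    return KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(all_descriptors)

def encode_image(descriptors: np.ndarray, vocab: KMeans) -> np.ndarray:
    """Project one image's descriptors onto the vocabulary and count word frequencies."""
    words = vocab.predict(descriptors)                       # nearest visual word per descriptor
    hist, _ = np.histogram(words, bins=np.arange(vocab.n_clusters + 1))
    return hist / max(hist.sum(), 1)                         # frequency vector over the vocabulary
```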
Sparse coding originated in neuroscience: neurophysiologists carried out comprehensive and in-depth research on the visual system and obtained significant results, which made it possible to simulate the visual system with computers in engineering. Building on this understanding, and drawing on existing biological findings together with signal processing, computational theory and information theory, the visual system can be modelled computationally so that a computer simulates human vision to some extent, helping to solve difficult problems that artificial intelligence encounters in image processing. The basic assumption of sparse coding is that there exists a set of basis vectors such that any input signal (vector) can be represented as a linear combination of them, where only a few of the combination coefficients are non-zero; these coefficients are the "sparse code".
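For reference, the reconstruction problem described here is commonly written as the following optimisation (a standard textbook formulation, not a formula given in the patent; μ denotes the sparsity weight, chosen here to avoid confusion with the balance factor λ used later in this document):

$$\min_{\alpha}\ \tfrac{1}{2}\,\lVert x - D\alpha \rVert_{2}^{2} \;+\; \mu\,\lVert \alpha \rVert_{1}$$

where x is the input signal, the columns of D are the basis vectors, and the sparse code α has only a few non-zero entries.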
Summary of the invention
The object of the invention is to address the deficiencies of the prior art by providing an object recognition method based on smartphone sensors.
The technical scheme adopted by the present invention to solve this technical problem is as follows: an object recognition method based on smartphone sensors, characterised in that the steps of the method are as follows:
(1) the user submits an object query on the smartphone, comprising the GPS coordinates, the camera FOV parameters and the picture captured by the camera;
(2) a spatial-index R-tree query is issued with the GPS coordinates to obtain the set of spatially neighbouring pictures;
(3) a probabilistic FOV model is built from the GPS coordinates and camera FOV parameters over the above picture set; the model takes into account the uncertainty of the GPS positioning and of the camera parameters and can estimate the probability that an object is captured by the camera; the probabilistic FOV model is:
$$\int_{Q} e^{-\frac{\lVert c\theta \rVert^{2}}{2\sigma_{1}^{2}}} \cdot e^{-\frac{\lVert d \rVert^{2}}{2\sigma_{2}^{2}}}\, dq$$
where Q is the circular region of GPS uncertainty described above, q is a unit area within Q, d and θ are respectively the ground distance and the deviation angle of the object from the camera, σ₁ and σ₂ are empirical constants, and c = 1;
(4) based on the probabilistic FOV model, a pruning strategy is further proposed: for a query, any object outside the set of probabilistic FOV instances corresponding to the query cannot be a candidate object; pictures related to objects that can hardly be captured in geographic space are deleted, which reduces the size of the candidate picture set;
(5) the visual-feature similarity between the query picture and the candidate picture set is computed; this similarity is measured from the angle of signal reconstruction by solving for the sparse code of the query picture over the candidate picture set;
(6) the visual similarity between the query picture and each object is computed by accumulating the picture similarities belonging to the same object and then normalising, which yields the similarity value with the object;
(7) steps 5 and 6 are combined by voting to obtain a comprehensive evaluation score for each candidate object;
(8) the comprehensive evaluation scores of the candidate objects in step 7 are sorted; the object with the highest score is the final result.
The beneficial effects of the present invention are as follows: by building a simple feature-extraction and geospatial-location index over a large number of photos on the internet (photo-sharing websites, etc.), the object captured in a query submitted from a smartphone (comprising GPS coordinates, camera FOV parameters and a picture) can be identified. The probabilistic FOV model based on geospatial location, the associated pruning strategy and the similarity measure based on visual space make the recognition algorithm both efficient and accurate.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the present invention.
Embodiments
The technical scheme of the present invention is further described below in combination with a concrete implementation and example.
As shown in Fig. 1, the specific implementation process and working principle of the present invention are as follows:
Step 1: the user initiates a query in the smartphone app. Specifically, the user takes a photo of the appearance of a certain object, and the app simultaneously records the current GPS coordinates, the camera FOV parameters (including the camera orientation, etc.) and the captured picture;
Step 2: in the offline phase, a spatial data index (e.g. an R-tree) is built over the geographic positions of the pictures in the database, and a visual feature vector is extracted from every picture with the visual word model, so that each picture corresponds to a geospatial coordinate, a visual feature vector and an associated object label;
For a query submitted online by the user, the submitted GPS coordinates are used to perform a point query on the existing spatial index, which returns the set of pictures near the query point; these pictures form the candidate picture set, and the objects they correspond to form the candidate object set;
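As one possible concrete realisation of the offline index and the online point query of Step 2, the sketch below uses the third-party Python package `rtree`; the patent does not prescribe a particular library, and the data layout (id, longitude, latitude) is an assumption of this sketch.

```python
# Sketch of the spatial index over photo positions and the nearest-neighbour query,
# using the `rtree` package as one possible R-tree implementation.
from rtree import index

def build_photo_index(photos):
    """photos: iterable of (photo_id, lon, lat); points are stored as degenerate boxes."""
    idx = index.Index()
    for photo_id, lon, lat in photos:
        idx.insert(photo_id, (lon, lat, lon, lat))
    return idx

def nearby_photos(idx, query_lon, query_lat, k=50):
    """Return the ids of the k photos stored closest to the query GPS fix."""
    return list(idx.nearest((query_lon, query_lat, query_lon, query_lat), k))
```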
Step 3: the pruning strategy of the probabilistic FOV model further reduces the size of the candidate object set: objects that can hardly be captured by the camera in geographic space are deleted from the set, and the corresponding pictures are deleted with them.
The traditional FOV model comprises four parameters: the position of the camera, the camera orientation, the viewing angle of the camera and the maximum visible distance of the camera. All of these are readily obtained from the sensors of a smartphone, and the projection of the field of view onto a two-dimensional plane is a sector. However, because of device measurement limitations, the GPS coordinates are usually rather inaccurate; the error of civilian GPS positioning equipment is generally around 50 metres, so the measured GPS coordinates carry an uncertainty that the traditional FOV model does not take into account.
In the probabilistic FOV model, a variable r is introduced to control the uncertainty of the GPS fix: the uncertainty region is a circle centred on the measured GPS coordinate with radius r, indicating that the true GPS position may lie anywhere within this circular region. In addition, the probabilistic FOV model assumes that the probability that an object is captured by the camera depends on the object's ground distance and deviation angle from the camera and follows a Gaussian distribution. The probability that the object is captured by the camera under this uncertainty is therefore the integral of the probability density function over the circular region, and this value measures how likely the object is to be captured in geographic space:
$$\int_{Q} e^{-\frac{\lVert c\theta \rVert^{2}}{2\sigma_{1}^{2}}} \cdot e^{-\frac{\lVert d \rVert^{2}}{2\sigma_{2}^{2}}}\, dq$$
where Q is the circular region of GPS uncertainty described above, q is a unit area within Q, d and θ are respectively the ground distance and the deviation angle of the object from the camera, and σ₁ and σ₂ are empirical constants. Since a change in σ₁ can absorb the scaling effect of c, c is omitted without loss of generality and c = 1 is used here.
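The integral above generally has no closed form, so one straightforward way to evaluate it is Monte Carlo sampling over the circular uncertainty region Q. The sketch below is an illustration only: the units (metres for distances, radians for angles, heading measured like atan2 from the +x axis), the default values of σ₁, σ₂ and r, and the sample count are assumptions, not values specified in the patent.

```python
# Monte Carlo sketch of the capture-probability integral of the probabilistic FOV model.
import math
import random

def fov_capture_probability(gps, heading, obj, r=50.0,
                            sigma1=math.radians(30), sigma2=100.0, n_samples=2000):
    """gps, obj: planar (x, y) positions in metres; heading: camera direction in radians,
    measured like atan2 (counter-clockwise from the +x axis); r: GPS uncertainty radius."""
    gx, gy = gps
    ox, oy = obj
    total = 0.0
    for _ in range(n_samples):
        # Sample a camera position q uniformly inside the circle Q around the GPS fix.
        rho = r * math.sqrt(random.random())
        phi = random.uniform(0.0, 2.0 * math.pi)
        qx, qy = gx + rho * math.cos(phi), gy + rho * math.sin(phi)
        d = math.hypot(ox - qx, oy - qy)                     # ground distance to the object
        bearing = math.atan2(oy - qy, ox - qx)
        theta = abs((bearing - heading + math.pi) % (2.0 * math.pi) - math.pi)  # deviation angle, c = 1
        total += math.exp(-theta ** 2 / (2.0 * sigma1 ** 2)) * math.exp(-d ** 2 / (2.0 * sigma2 ** 2))
    return (total / n_samples) * (math.pi * r * r)           # estimate of the integral over Q
```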
Step 4: the pruning strategy based on the above probabilistic FOV model is introduced next. Because of the GPS uncertainty discussed above, for any object that appears in a given query there must exist some probabilistic FOV instance (a sector) that contains it; in other words, the set of all possible probabilistic FOV instances must cover all possible candidate objects. Therefore, for a query, any object lying outside the set of probabilistic FOV instances corresponding to the query cannot be a candidate object. The pruning strategy deletes the pictures related to objects that can hardly be captured in geographic space, which shrinks the candidate picture set, removes unnecessary cost from the subsequent visual-feature computation and improves the performance of the whole system.
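Given such a capture-probability estimate, the pruning itself reduces to discarding candidates whose probability is negligible. The sketch below reuses the `fov_capture_probability` function from the previous sketch; the threshold value is an illustrative assumption.

```python
# Sketch of the pruning step: drop candidate photos whose object can hardly be captured.
def prune_candidates(candidates, gps, heading, threshold=1e-3):
    """candidates: iterable of (photo_id, object_id, object_xy) tuples."""
    return [(photo_id, object_id, object_xy)
            for photo_id, object_id, object_xy in candidates
            if fov_capture_probability(gps, heading, object_xy) > threshold]
```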
Step 5: the visual similarity between the query picture and all objects obtained in the above steps is computed, indicating, from the visual point of view, which object the query picture submitted by the user most resembles.
Traditional picture similarity is computed in a vector space model, for example with the cosine distance or the Euclidean distance. The biggest problem with this class of methods is that the similarity values between pictures lack discrimination, because picture feature vectors are high-dimensional and sparse. To overcome this problem, a similarity computation based on sparse coding from the field of signal processing is proposed.
The basic idea of sparse coding is that, given some basis vectors (also called atoms) in the signal space, any input can be represented as a linear combination of these basis vectors; the combination coefficients are the code, and only a few of them are non-zero while most are zero, hence "sparse". Normally the basis vectors are learned from a large number of training samples; in this method the candidate picture set from step (3) is taken as the basis, so the problem becomes: given the feature vectors of these pictures, can the feature vector of a new input picture be reconstructed as a linear combination of them? This is exactly the basic problem of sparse coding, and the solved coefficient is the similarity between the input picture and the corresponding candidate picture. From the signal-processing point of view, it can be understood as the contribution that the candidate picture makes to reconstructing the input picture.
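A minimal sketch of this reconstruction view follows: the candidate photos' feature vectors form the dictionary, and an L1-regularised least-squares (Lasso) solver yields one sparse coefficient per candidate photo. The choice of solver and the regularisation strength are assumptions of this sketch; the patent does not name a specific optimisation method.

```python
# Sketch of the sparse-coding similarity: reconstruct the query vector from candidate vectors.
import numpy as np
from sklearn.linear_model import Lasso

def sparse_codes(query_vec: np.ndarray, candidate_vecs: np.ndarray, alpha: float = 0.01) -> np.ndarray:
    """candidate_vecs has shape (n_candidates, dim); returns one coefficient per candidate,
    interpreted as that photo's contribution to reconstructing the query photo."""
    dictionary = candidate_vecs.T                  # columns are the candidate photos
    model = Lasso(alpha=alpha, positive=True, max_iter=10000)
    model.fit(dictionary, query_vec)
    return model.coef_                             # sparse: most entries are zero
```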
Step 6: the similarities of the pictures belonging to the same object are accumulated and then normalised to obtain the visual similarity between the query picture and that object; this metric measures, from the image content itself, which object the query is closer to.
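The per-object accumulation and normalisation of Step 6 can be sketched as follows; clamping coefficients at zero and normalising the scores to a unit sum are assumptions of this sketch.

```python
# Sketch of Step 6: sum per-photo coefficients by object label, then normalise.
def object_visual_similarity(coefs, photo_object_ids):
    """coefs[i] is the sparse coefficient of candidate photo i;
    photo_object_ids[i] is the object label attached to that photo."""
    scores = {}
    for coef, obj in zip(coefs, photo_object_ids):
        scores[obj] = scores.get(obj, 0.0) + max(float(coef), 0.0)
    total = sum(scores.values()) or 1.0
    return {obj: s / total for obj, s in scores.items()}
```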
Step 7: the probability that an object is captured by the camera, obtained from the geographic-space perspective in steps (3) and (4), and the similarity between the query picture and that object from the visual-feature perspective are weighted together to obtain the comprehensive evaluation score of the object. A weight variable λ, called the balance factor, controls whether the evaluation score leans more towards the geographic-space side or the visual-feature side.
In the present invention the balance factor is an adjustable parameter: if it is zero, the recognition algorithm uses only geographic-space information and ignores the visual features; if it is one, only visual-feature information is used and geospatial information is ignored. Clearly, in regions where objects are relatively sparse, the purely geographic probabilistic FOV model can already separate to a large extent the probabilities with which different objects are captured by the camera, so the similarity in visual space need not be computed and the computational cost stays small. Conversely, in regions where objects are dense, the probabilistic FOV model no longer discriminates well, and the weight of the visual-feature similarity must be increased to compensate.
Step 8: the comprehensive evaluation score of each object in the candidate object set has been obtained in step (7); after sorting these scores, the object with the maximum value is the one captured by the user's camera, which completes the automatic recognition.
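Steps 7 and 8 can be sketched as a weighted combination followed by sorting. Treating the combination as the convex sum (1 − λ) times the geographic capture probability plus λ times the visual similarity is an assumption consistent with the balance-factor description above (λ = 0 uses geography only, λ = 1 uses visual features only); the patent states only that the two scores are weighted.

```python
# Sketch of Steps 7-8: combine geographic and visual scores per object, then rank.
def rank_objects(geo_probs, vis_sims, lam=0.5):
    """geo_probs / vis_sims: dicts mapping object_id to the FOV-model probability /
    the normalised visual similarity; lam is the balance factor."""
    objects = set(geo_probs) | set(vis_sims)
    combined = {o: (1.0 - lam) * geo_probs.get(o, 0.0) + lam * vis_sims.get(o, 0.0)
                for o in objects}
    # The highest-scoring object is the recognition result.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```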

Claims (1)

1. An object recognition method based on smartphone sensors, characterised in that the steps of the method are as follows:
(1) the user submits an object query on the smartphone, comprising the GPS coordinates, the camera FOV parameters and the picture captured by the camera;
(2) a spatial-index R-tree query is issued with the GPS coordinates to obtain the set of spatially neighbouring pictures;
(3) a probabilistic FOV model is built from the GPS coordinates and camera FOV parameters over the above picture set; the model takes into account the uncertainty of the GPS positioning and of the camera parameters and can estimate the probability that an object is captured by the camera; the probabilistic FOV model is
$$\int_{Q} e^{-\frac{\lVert c\theta \rVert^{2}}{2\sigma_{1}^{2}}} \cdot e^{-\frac{\lVert d \rVert^{2}}{2\sigma_{2}^{2}}}\, dq$$
where Q is the circular region of GPS uncertainty described above, q is a unit area within Q, d and θ are respectively the ground distance and the deviation angle of the object from the camera, σ₁ and σ₂ are empirical constants, and c = 1;
(4) based on the probabilistic FOV model, a pruning strategy is further proposed: for a query, objects outside the set of probabilistic FOV instances corresponding to the query cannot be candidate objects; pictures related to objects that can hardly be captured in geographic space are deleted, which reduces the size of the candidate picture set;
(5) the visual-feature similarity between the query picture and the candidate picture set is computed; this similarity is measured from the angle of signal reconstruction by solving for the sparse code of the query picture over the candidate picture set;
(6) the visual similarity between the query picture and each object is computed by accumulating the picture similarities of the same object and then normalising, which yields the similarity value with the object;
(7) steps 5 and 6 are combined by voting to obtain the comprehensive evaluation score of the candidate objects;
(8) the comprehensive evaluation scores of the candidate objects in step 7 are sorted; the object with the highest score is the final result.
CN201310690339.4A 2013-12-16 2013-12-16 Method for recognizing object based on intelligent mobile phone sensor Pending CN103678610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310690339.4A CN103678610A (en) 2013-12-16 2013-12-16 Method for recognizing object based on intelligent mobile phone sensor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310690339.4A CN103678610A (en) 2013-12-16 2013-12-16 Method for recognizing object based on intelligent mobile phone sensor

Publications (1)

Publication Number Publication Date
CN103678610A true CN103678610A (en) 2014-03-26

Family

ID=50316155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310690339.4A Pending CN103678610A (en) 2013-12-16 2013-12-16 Method for recognizing object based on intelligent mobile phone sensor

Country Status (1)

Country Link
CN (1) CN103678610A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107870961A (en) * 2016-09-23 2018-04-03 李雨暹 Method and system for searching and sorting space objects and computer readable storage device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708380A (en) * 2012-05-08 2012-10-03 东南大学 Indoor common object identification method based on machine vision
CN102819752A (en) * 2012-08-16 2012-12-12 北京理工大学 System and method for outdoor large-scale object recognition based on distributed inverted files
CN102831405A (en) * 2012-08-16 2012-12-19 北京理工大学 Method and system for outdoor large-scale object identification on basis of distributed and brute-force matching

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708380A (en) * 2012-05-08 2012-10-03 东南大学 Indoor common object identification method based on machine vision
CN102819752A (en) * 2012-08-16 2012-12-12 北京理工大学 System and method for outdoor large-scale object recognition based on distributed inverted files
CN102831405A (en) * 2012-08-16 2012-12-19 北京理工大学 Method and system for outdoor large-scale object identification on basis of distributed and brute-force matching

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PAI PENG et al.: "The Knowing Camera: Recognizing Places-of-Interest in Smartphone Photos", Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107870961A (en) * 2016-09-23 2018-04-03 李雨暹 Method and system for searching and sorting space objects and computer readable storage device
CN107870961B (en) * 2016-09-23 2020-09-18 李雨暹 Method and system for searching and sorting space objects and computer readable storage device

Similar Documents

Publication Publication Date Title
CN111199564B (en) Indoor positioning method and device of intelligent mobile terminal and electronic equipment
Ranganathan et al. Towards illumination invariance for visual localization
CN107133325B (en) A geospatial location method for internet photos based on street view map
CN111046125A (en) Visual positioning method, system and computer readable storage medium
WO2020224305A1 (en) Method and apparatus for device positioning, and device
CN102034101B (en) Method for quickly positioning circular mark in PCB visual detection
CN109658445A (en) Network training method, increment build drawing method, localization method, device and equipment
JP5385105B2 (en) Image search method and system
CN110704712A (en) Recognition method and system of scene picture shooting location range based on image retrieval
CN108961330A (en) The long measuring method of pig body and system based on image
CN103489191B (en) A kind of remote sensing images well-marked target change detecting method
CN114241464A (en) Cross-view image real-time matching geolocation method and system based on deep learning
Wu et al. Accurate smartphone indoor visual positioning based on a high-precision 3D photorealistic map
CN102938075A (en) RVM (relevant vector machine) method for maximum wind radius and typhoon eye dimension modeling
Vishal et al. Accurate localization by fusing images and GPS signals
CN112258580A (en) Visual SLAM loop detection method based on deep learning
CN105045841B (en) With reference to gravity sensor and the characteristics of image querying method of image characteristic point angle
CN108537101A (en) A kind of pedestrian's localization method based on state recognition
Song et al. A handheld device for measuring the diameter at breast height of individual trees using laser ranging and deep-learning based image recognition
Wu et al. An efficient visual loop closure detection method in a map of 20 million key locations
Zhang et al. Topological spatial verification for instance search
Xue et al. A fast visual map building method using video stream for visual-based indoor localization
CN109739830B (en) Position fingerprint database rapid construction method based on crowdsourcing data
CN103678610A (en) Method for recognizing object based on intelligent mobile phone sensor
Zhao et al. CrowdOLR: Toward object location recognition with crowdsourced fingerprints using smartphones

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140326

WD01 Invention patent application deemed withdrawn after publication