CN103678610A - Method for recognizing object based on intelligent mobile phone sensor - Google Patents
- Publication number: CN103678610A
- Application number: CN201310690339.4A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval of still image data
- G06F16/20—Information retrieval of structured data, e.g. relational data
- G06F16/29—Geographical information databases
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
Abstract
The invention discloses an object recognition method based on smartphone sensors. The method makes full use of the rich sensor data of a smartphone, including the GPS position, the camera, and the camera parameters. It proposes a probabilistic field-of-view (FOV) model based on geographic position, a pruning strategy derived from that model, and a similarity measure based on visual space. By combining these modalities, the method of the invention can correctly identify the object queried by the user.
Description
Technical field
The present invention relates to spatial data indexing, sparse coding in image recognition and retrieval, and signal processing, and in particular to an object recognition method based on smartphone sensors.
Background technology
Spatial data are data that represent the position, shape, size, and distribution characteristics of spatial entities; at present they are used mainly in the field of Geographic Information Systems (GIS). Using spatial data requires large numbers of spatial operations such as queries, insertions, and deletions, so an efficient spatial index is essential. The R-tree is currently the most widely used spatial index structure; it extends the B-tree to multidimensional space and is a balanced tree structure. An R-tree approximates a complex spatial object by its minimum bounding rectangle (MBR), aligned with the axes of the data space; its main advantage is that a complex object can be represented by only a few bytes. Although much information is lost this way, the MBR preserves the most important geometric property of the object, namely its position and its extent along each coordinate axis.
In image recognition and retrieval, the bag of visual words (BoVW) model is one of the most common and most effective methods. Inspired by the bag-of-words model from text retrieval, it clusters the local feature points of a large number of pictures in an offline phase; the cluster centers are the visual words. A new picture is then represented by projecting its detected local features onto these precomputed visual words, turning the image into a feature vector over the visual vocabulary: the dimension of the vector is the vocabulary size, and the value in each dimension is the frequency with which that visual word occurs in the image. Once images are vectorized, similarity measures from the vector space model (such as cosine distance) apply directly, and a new image's category can be recognized by training a k-nearest-neighbor classifier or a support vector machine on existing samples.
Sparse coding originated in neuroscience, where neurophysiologists carried out comprehensive and deep research on the visual system and obtained significant results, making it possible in engineering to simulate the visual system with a computer. Building on this understanding, and combining existing biological results with signal processing, the theory of computation, and information theory, the visual system can be modeled computationally so that a computer simulates human vision to some extent, addressing a hard problem that artificial intelligence encounters in image processing. The basic assumption of sparse coding is that there exists a set of basis vectors such that any input signal (vector) can be expressed as a linear combination of them, where only a few coefficients of the combination are nonzero; these coefficients are the "sparse code".
Summary of the invention
The object of the invention is to address the deficiencies of the prior art by providing an object recognition method based on smartphone sensors.
The technical scheme the invention adopts to solve this problem is as follows. An object recognition method based on smartphone sensors, characterized in that the steps of the method are:
(1) The user submits an object query on the smartphone, comprising the GPS coordinate, the camera FOV parameters, and the picture captured by the camera;
(2) A spatial query against the R-tree index is initiated with the GPS coordinate, returning the set of spatially neighboring pictures;
(3) From this picture set, the GPS coordinates and the camera FOV parameters are gathered to build a probabilistic FOV model. The model accounts for the uncertainty of the GPS position and of the camera parameters and estimates the probability that an object is captured by the camera. The probabilistic FOV model is:

P = ∫_Q c · exp(−d² / (2σ1²)) · exp(−θ² / (2σ2²)) dq

where Q is the circular region of GPS uncertainty, q is a unit area within Q, d and θ are the ground distance and the deviation angle of the object from the camera, σ1 and σ2 are empirical constants, and c = 1;
(4) With the probabilistic FOV model in place, a pruning strategy based on the model is proposed: for a given query, objects lying outside the set of probabilistic FOV instances of that query cannot be candidate objects. The pictures of objects that can hardly be captured in geographic space are deleted, shrinking the candidate picture set;
(5) The visual-feature similarity between the query picture and the candidate pictures is computed. This similarity space takes the viewpoint of signal reconstruction: the sparse code of the query picture over the candidate pictures is solved;
(6) The visual similarity between the query picture and each object is computed: the similarities of the pictures of the same object are summed and then normalized, yielding the similarity value for that object;
(7) The results of steps 5 and 6 are combined by voting to obtain the comprehensive evaluation score of each candidate object;
(8) The candidate objects are sorted by the comprehensive evaluation score of step 7; the object with the highest score is the final result.
The beneficial effects of the invention are these: by building a simple feature-extraction pipeline and a geospatial index over the large number of photos on the Internet (photo-sharing websites, etc.), the captured object can be identified from the query the user submits on the smartphone (comprising the GPS coordinate, the camera FOV parameters, and the picture). The probabilistic FOV model based on geographic position, its associated pruning strategy, and the similarity measure based on visual space make the recognition algorithm both efficient and accurate.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the invention.
Embodiment
The technical scheme of the invention is further described below in connection with a concrete implementation and example.
As shown in Fig. 1, the specific implementation process and working principle of the invention are as follows:
Step 1: The user initiates a query in the smartphone app. Specifically, the user takes a photo of the exterior of some object, and the app simultaneously records the current GPS coordinate, the camera FOV parameters (including the camera orientation, etc.), and the picture taken;
Step 2: In the offline phase, a spatial index (such as an R-tree) is built over the geographic positions of the pictures in the database, and a visual feature vector is extracted from every picture with the visual word model, so that every picture is associated with a geographic coordinate, a visual feature vector, and an object label;
For a query submitted online, the submitted GPS coordinate is used for a point query against the existing spatial index, which returns the set of pictures near the query point. These pictures form the candidate picture set, and the objects they depict form the candidate object set;
Step 3: The pruning strategy of the probabilistic FOV model further shrinks the candidate object set: objects that can hardly be captured by the camera in geographic space are deleted from the set, and their corresponding pictures are deleted as well.
The traditional FOV model has four parameters: the position of the camera, the orientation of the camera, the view angle of the camera, and the maximum visible distance of the camera. All of these are readily obtained from the sensors of a smartphone, and projected onto the two-dimensional plane the FOV is a circular sector. However, owing to measurement limitations of the device, the GPS coordinate is usually quite inaccurate; the error of civilian GPS equipment is generally around 50 meters, so the measured GPS coordinate carries uncertainty, which the traditional FOV model does not consider.
The probabilistic FOV model introduces a variable r that controls the uncertainty of the GPS position. The uncertainty region is a circle of radius r centered at the measured GPS coordinate, indicating that the true GPS position may lie anywhere in this circular region. In addition, the model assumes that the probability that an object is captured by the camera depends on the ground distance and the deviation angle of the object from the camera, and is Gaussian. The probability that the object is captured within this uncertainty region is therefore the integral of the probability density function over the circular region, and this value measures how likely the camera is to capture the object in geographic space:

P = ∫_Q c · exp(−d² / (2σ1²)) · exp(−θ² / (2σ2²)) dq

where Q is the circular region of GPS uncertainty, q is a unit area within Q, d and θ are the ground distance and the deviation angle of the object from the camera, and σ1 and σ2 are empirical constants. Because a change in σ1 in fact affects θ, c can be omitted without loss of generality; here c = 1.
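The integral over the circular region Q generally has no closed form, but it can be approximated numerically. A Monte Carlo sketch under stated assumptions: the values of r, σ1, σ2 are illustrative, the average of the integrand over Q is used as the normalized integral, and the camera is assumed to face along the positive x-axis from each sampled position (the patent does not fix any of these choices):

```python
import math
import random

def capture_probability(obj, gps, r, sigma1, sigma2, samples=20000):
    """Monte Carlo estimate of the probabilistic FOV value: average
    exp(-d^2/(2*sigma1^2)) * exp(-theta^2/(2*sigma2^2))  (c = 1)
    over camera positions sampled uniformly from the disc Q of radius r
    around the measured GPS coordinate."""
    random.seed(0)
    total = 0.0
    for _ in range(samples):
        # Rejection-sample a uniform point in the disc Q.
        while True:
            dx, dy = random.uniform(-r, r), random.uniform(-r, r)
            if dx * dx + dy * dy <= r * r:
                break
        cx, cy = gps[0] + dx, gps[1] + dy
        d = math.hypot(obj[0] - cx, obj[1] - cy)        # ground distance
        theta = math.atan2(obj[1] - cy, obj[0] - cx)    # deviation from the assumed +x heading
        total += math.exp(-d * d / (2 * sigma1 ** 2)) * \
                 math.exp(-theta * theta / (2 * sigma2 ** 2))
    return total / samples

near = capture_probability(obj=(30.0, 0.0), gps=(0.0, 0.0), r=50.0,
                           sigma1=100.0, sigma2=0.5)
far = capture_probability(obj=(30.0, 300.0), gps=(0.0, 0.0), r=50.0,
                          sigma1=100.0, sigma2=0.5)
print(near > far)  # a nearby on-axis object scores higher than a distant off-axis one
```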
Step 4: The pruning strategy based on the probabilistic FOV model is introduced next. Given the GPS uncertainty described above, for any object that appears in some query there must exist some probabilistic FOV instances (sectors) that contain it. In other words, the set of all possible probabilistic FOV instances must cover all possible candidate objects. Therefore, for a given query, objects lying outside the set of probabilistic FOV instances of that query cannot be candidate objects. The pruning strategy deletes the pictures of objects that can hardly be captured in geographic space, thereby shrinking the candidate picture set, cutting unnecessary computation in the subsequent visual-feature stage, and improving the performance of the whole system.
Step 5: The visual similarity between the query picture and all the objects obtained in the steps above is computed, indicating which object the query picture submitted by the user most resembles from the visual point of view.
Traditional picture similarity is computed in the vector space model, for example with cosine distance or Euclidean distance. The biggest problem with such methods is that the similarity values between pictures lack discrimination, because picture feature vectors are high-dimensional and sparse. To overcome this problem, a similarity computation based on sparse coding from the signal processing field is proposed.
The basic idea of sparse coding is that, given some basis vectors (also called atoms) in the signal space, any input can be represented as a linear combination of them. The coefficients of the combination are the code; only a few coefficients are nonzero while most are zero, hence "sparse". In general the basis vectors must be trained from a large number of samples; in this method the candidate picture set from step (3) serves as the basis, so the problem becomes: given the feature vectors of these pictures, can the feature vector of a new input picture be reconstructed as a linear combination of them? This is exactly the basic problem of sparse coding, and each solved coefficient is the similarity between the input picture and the corresponding candidate picture. From the signal processing point of view, it can be understood as the contribution that the candidate picture makes toward reconstructing the input picture.
Step 6: Summing the similarities of the pictures of the same object and then normalizing yields the visual similarity between the query picture and that object; this metric measures, from image content alone, which object the query is closest to.
Step 7: The probability that an object is captured by the camera, obtained from the geographic point of view in steps (3) and (4), and the similarity between the query picture and the object, obtained from the visual-feature point of view, are weighted together to give the comprehensive evaluation score of the object. A weight variable λ, called the balance factor, controls whether the evaluation score leans toward the geographic side or the visual side.
In the invention the balance factor is a tunable parameter. If it is zero, the recognition algorithm uses only geographic information and ignores visual features; if it is one, only visual features are used and geographic information is ignored. Evidently, in regions where objects are relatively sparse, the purely geographic probabilistic FOV model can largely separate the probabilities of different objects being captured by the camera, so the similarity in visual space need not be computed at all, saving cost. Conversely, in regions where objects are relatively dense, the probabilistic FOV model no longer discriminates well, and the weight of the visual similarity must be raised to compensate.
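The final combination is then a convex mix of the geographic probability and the visual similarity under the balance factor λ; a minimal sketch, where λ = 0 keeps only geography and λ = 1 only vision, matching the two extremes described above (the object names are illustrative):

```python
def comprehensive_scores(geo_prob, vis_sim, lam):
    """Weight the geographic capture probability against the visual
    similarity; the balance factor lam in [0, 1] shifts trust between
    the two modalities."""
    objects = set(geo_prob) | set(vis_sim)
    return {o: (1 - lam) * geo_prob.get(o, 0.0) + lam * vis_sim.get(o, 0.0)
            for o in objects}

geo_prob = {"tower": 0.7, "fountain": 0.2}   # from the probabilistic FOV model
vis_sim = {"tower": 0.4, "fountain": 0.6}    # from sparse-coding similarity
scores = comprehensive_scores(geo_prob, vis_sim, lam=0.5)
best = max(scores, key=scores.get)           # step 8: highest score wins
print(best)  # → tower
```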
Step 8: The comprehensive evaluation score of every object in the candidate object set has now been obtained. After sorting by this score, the object with the maximum value is the one captured by the user's camera, which completes the automatic identification.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310690339.4A CN103678610A (en) | 2013-12-16 | 2013-12-16 | Method for recognizing object based on intelligent mobile phone sensor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103678610A true CN103678610A (en) | 2014-03-26 |
Family
ID=50316155
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310690339.4A Pending CN103678610A (en) | 2013-12-16 | 2013-12-16 | Method for recognizing object based on intelligent mobile phone sensor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103678610A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102708380A (en) * | 2012-05-08 | 2012-10-03 | 东南大学 | Indoor common object identification method based on machine vision |
CN102819752A (en) * | 2012-08-16 | 2012-12-12 | 北京理工大学 | System and method for outdoor large-scale object recognition based on distributed inverted files |
CN102831405A (en) * | 2012-08-16 | 2012-12-19 | 北京理工大学 | Method and system for outdoor large-scale object identification on basis of distributed and brute-force matching |
Non-Patent Citations (1)
Title |
---|
PAI PENG et al., "The Knowing Camera: Recognizing Places-of-Interest in Smartphone Photos", Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107870961A (en) * | 2016-09-23 | 2018-04-03 | 李雨暹 | Method and system for searching and sorting space objects and computer readable storage device |
CN107870961B (en) * | 2016-09-23 | 2020-09-18 | 李雨暹 | Method and system for searching and sorting space objects and computer readable storage device |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20140326 |