[go: up one dir, main page]

skip to main content
10.1145/3269206.3271691acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article
Public Access

PSLSH: An Index Structure for Efficient Execution of Set Queries in High-Dimensional Spaces

Published: 17 October 2018 Publication History

Abstract

Efficient implementations of range and nearest neighbor queries are critical in many large multimedia applications. Locality Sensitive Hashing (LSH) is a popular technique for performing approximate searches in high-dimensional multimedia, such as image or sensory data. Often times, these multimedia data are represented as a collection of important spatio-temporal features which are extracted by using localized feature extraction algorithms. When a user wants to search for a given entity (object, event, or observation), individual similarity search queries, which collectively form a set query, need to be performed on the features that represent the particular search entity. Existing LSH techniques require that users provide an accuracy guarantee for each query in the set query, instead of an overall guarantee for the entire set query, which can lead to misses or wasteful work. We propose a novel index structure, Point Set LSH (PSLSH), which is able to execute a similarity search for a given set of search points in the high-dimensional space with a user-provided guarantee for the entire set query. Experimental evaluation shows significant gains in efficiency and accuracy trade-offs for executing set queries in high-dimensional spaces.

References

[1]
P. Indyk, et al. Approximate nearest neighbors: Towards removing the curse of dimensionality. STOC 1998.
[2]
M. Datar, et al. Locality-sensitive hashing scheme based on p-stable distributions. SCG 2004.
[3]
Q. Lv, et al. Multi-probe lsh: Efficient indexing for high-dimensional similarity search. PVLDB 2007.
[4]
O. Maron, et al. A Framework for Multiple-instance Learning. NIPS 1997.
[5]
W. Zhang, et al. Data-oriented locality sensitive hashing. MM 2010.
[6]
J. Gao, et al. DSH: Data Sensitive Hashing for High-dimensional K-nnsearch. SIGMOD 2014.
[7]
Y. Sun, et al. SRS: Solving C-approximate Nearest Neighbor Queries in High Dimensional Euclidean Space with a Tiny Index. PVLDB 2014.
[8]
A. Andoni, et al. Optimal Data-Dependent Hashing for Approximate Near Neighbors. STOC 2015.
[9]
A. Joly, et al. A posteriori multi-probe locality sensitive hashing. MM 2008.
[10]
Y. Liu, et al. Sk-lsh: An efficient index structure for approximate nearest neighbor search. PVLDB 2014.
[11]
M. Bawa, et al. Lsh forest: Self-tuning indexes for similarity search. WWW 2005.
[12]
J. Gan, et al. Locality-sensitive hashing scheme based on dynamic collision counting. SIGMOD 2012.
[13]
D. Lowe. Distinctive image features from scale-invariant keypoints. IJCV 2004.
[14]
H. Bay, et al. Surf: Speeded up robust features. ECCV 2006.
[15]
J. Wang, et al. Hashing for similarity search: A survey. arXiv preprint arXiv:1408.2927, 2014.
[16]
Q. Huang, et al. Query-aware locality-sensitive hashing for approximate nearest neighbor search. PVLDB 2015.
[17]
Y. Tao, et al. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space. TODS 2010.
[18]
Y. Zheng, et al. Lazylsh: Approximate nearest neighbor search for multiple distance functions with a single index. SIGMOD 2016.
[19]
A. Dasgupta, et al. Fast locality-sensitive hashing. SIGKDD 2011.
[20]
J. Gao, et al. Selective Hashing: Closing the Gap between Radius Search and k-NN Search. SIGKDD 2015.
[21]
S. Edlund, et al. The spatiotemporal epidemiological modeler. IHI 2010.
[22]
S. Liu, et al. epidms: Data management and analytics for decision-making from epidemic spread simulation ensembles. The Journal of Infectious Diseases 2016.

Cited By

View all
  • (2024)SLoSH: Set Locality Sensitive Hashing via Sliced-Wasserstein Embeddings2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)10.1109/WACV57701.2024.00255(2554-2564)Online publication date: 3-Jan-2024
  • (2024)A Trajectory-oriented Locality-sensitive Hashing Method for User IdentificationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.3324427(1-14)Online publication date: 2024
  • (2020)Efficient Bitmap-based Indexing and Retrieval of Similarity Search Image Queries2020 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI)10.1109/SSIAI49293.2020.9094616(58-61)Online publication date: Mar-2020
  • Show More Cited By

Index Terms

  1. PSLSH: An Index Structure for Efficient Execution of Set Queries in High-Dimensional Spaces

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
    October 2018
    2362 pages
    ISBN:9781450360142
    DOI:10.1145/3269206
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 October 2018

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. locality sensitive hashing
    2. nearest-neighbor search

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    CIKM '18
    Sponsor:

    Acceptance Rates

    CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Upcoming Conference

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)24
    • Downloads (Last 6 weeks)5
    Reflects downloads up to 09 Oct 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)SLoSH: Set Locality Sensitive Hashing via Sliced-Wasserstein Embeddings2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)10.1109/WACV57701.2024.00255(2554-2564)Online publication date: 3-Jan-2024
    • (2024)A Trajectory-oriented Locality-sensitive Hashing Method for User IdentificationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.3324427(1-14)Online publication date: 2024
    • (2020)Efficient Bitmap-based Indexing and Retrieval of Similarity Search Image Queries2020 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI)10.1109/SSIAI49293.2020.9094616(58-61)Online publication date: Mar-2020
    • (2020)mmLSH: A Practical and Efficient Technique for Processing Approximate Nearest Neighbor Queries on Multimedia DataSimilarity Search and Applications10.1007/978-3-030-60936-8_4(47-61)Online publication date: 14-Oct-2020

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Get Access

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media