Entity-Based Query Recommendation for Long-Tail Queries

Published: 22 August 2018

Abstract

Query recommendation, which suggests related queries to search engine users, has attracted a lot of attention in recent years. Most existing solutions analyze users’ search history (or query logs) and are often insufficient for long-tail queries, which rarely appear in query logs. To handle such queries, we study the use of entities found in queries to provide recommendations. Specifically, we extract entities from a query and use them to explore new entities by consulting an information source. The discovered entities are then used to suggest new queries to the user. In this article, we examine two information sources: (1) a knowledge base (or KB), such as YAGO and Freebase; and (2) a click log, which contains the URLs accessed by a query user. We study how to use these sources to find new entities useful for query recommendation. We further study a hybrid framework that effectively integrates different query recommendation methods. As shown in the experiments, our proposed approaches provide better recommendations than existing solutions for long-tail queries. In addition, our query recommendation process takes less than 100 ms to complete. Thus, our solution is suitable for providing online query recommendation services for search engines.
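
The three steps named in the abstract (extract entities from the query, explore new entities through an information source, suggest new queries) can be illustrated with a minimal Python sketch. It is not the authors' method: the toy knowledge base `TOY_KB`, the whole-token entity matching, and the idea of forming a suggestion by swapping an entity for a related one are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical toy knowledge base: entity -> related entities.
# The contents are illustrative assumptions, not data from the article.
TOY_KB: Dict[str, List[str]] = {
    "yago": ["freebase", "dbpedia"],
    "freebase": ["yago"],
}


def extract_entities(query: str, kb: Dict[str, List[str]]) -> List[str]:
    """Return KB entities that occur as whole tokens in the query (naive matching)."""
    return [token for token in query.lower().split() if token in kb]


def explore(entities: List[str], kb: Dict[str, List[str]]) -> List[str]:
    """Collect KB neighbours of the given entities, in order, without duplicates."""
    found: List[str] = []
    for entity in entities:
        for neighbour in kb.get(entity, []):
            if neighbour not in entities and neighbour not in found:
                found.append(neighbour)
    return found


def recommend(query: str, kb: Dict[str, List[str]], k: int = 3) -> List[str]:
    """Suggest up to k queries by swapping each query entity for a related one."""
    tokens = query.lower().split()
    suggestions: List[str] = []
    for entity in extract_entities(query, kb):
        for neighbour in explore([entity], kb):
            # Assumed generation rule: substitute the related entity in place.
            candidate = " ".join(neighbour if t == entity else t for t in tokens)
            if candidate not in suggestions:
                suggestions.append(candidate)
    return suggestions[:k]


print(recommend("yago schema download", TOY_KB))
# ['freebase schema download', 'dbpedia schema download']
```

Because the sketch needs no prior occurrence of the query in a log, it still returns suggestions for a query seen for the first time, which is the property that makes entity-based exploration attractive for long-tail queries. A query with no recognizable entity yields an empty list, which is where a hybrid of several methods would have to fall back on another recommender.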



    Published In

    ACM Transactions on Knowledge Discovery from Data, Volume 12, Issue 6
    December 2018
    327 pages
    ISSN:1556-4681
    EISSN:1556-472X
    DOI:10.1145/3271478
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 August 2018
    Accepted: 01 June 2018
    Revised: 01 April 2018
    Received: 01 November 2017
    Published in TKDD Volume 12, Issue 6

    Author Tags

    1. Query recommendation
    2. entity
    3. knowledge base

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Research Grants Council of Hong Kong
