[go: up one dir, main page]

skip to main content
10.1145/3299869.3319890acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access

Answering Why-questions by Exemplars in Attributed Graphs

Published: 25 June 2019 Publication History

Abstract

This paper studies the problem of \em answering Why-questions for graph pattern queries. Given a query Q, its answers $Q(G)$ in a graph G, and an exemplar $\E$ that describes desired answers, it aims to compute a query rewrite $Q'$, such that $Q'(G)$ incorporates relevant entities and excludes irrelevant ones wrt $\E$ under a closeness measure. (1) We characterize the problem by \em Q-Chase. It rewrites Q by applying a sequence of applicable operators guided by $\E$, and backtracks to derive optimal query rewrite. (2) We develop feasible Q-Chase-based algorithms, from anytime solutions to fixed-parameter approximations to compute query rewrites. These algorithms implement Q-Chase by detecting picky operators at run time, which discriminately enforce $\E$ to retain answers that are closer to exemplars, and effectively prune both operators and irrelevant matches, by consulting a cache of star patterns (called \em star views ). Using real-world graphs, we experimentally verify the efficiency and effectiveness of \qchase techniques and their applications.

References

[1]
Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases .Addison-Wesley.
[2]
Takuya Akiba, Yoichi Iwata, and Yuichi Yoshida. 2013. Fast exact shortest-path distance queries on large networks. In SIGMOD .
[3]
Günecs Alucc, Olaf Hartig, M Tamer Özsu, and Khuzaima Daudjee. 2014. Diversified stress testing of RDF data management systems. In ISWC .
[4]
Peter Buneman, Sanjeev Khanna, and Tan Wang-Chiew. 2001. Why and where: A characterization of data provenance. In ICDT .
[5]
Wenfei Fan, Jianzhong Li, Shuai Ma, Hongzhi Wang, and Yinghui Wu. 2010. Graph homomorphism revisited for graph matching. VLDB (2010), 1161--1172.
[6]
Wenfei Fan, Xin Wang, and Yinghui Wu. 2013. Diversified top-k graph pattern matching. VLDB (2013), 1510--1521.
[7]
Wenfei Fan, Yinghui Wu, and Jingbo Xu. 2016. Functional dependencies for graphs. In SIGMOD. 1843--1857.
[8]
Mario Arias Gallego, Javier D Fernández, Miguel A Mart'inez-Prieto, and Pablo de la Fuente. 2011. An empirical study of real-world SPARQL queries. USEWOD workshop .
[9]
Xinbo Gao, Bing Xiao, Dacheng Tao, and Xuelong Li. 2010. A survey of graph edit distance. Pattern Analysis and applications (2010), 113--129.
[10]
Michael Garey and David Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness .
[11]
Md Saiful Islam, Chengfei Liu, and Jianxin Li. 2015. Efficient answering of why-not questions in similar graph matching. TKDE (2015).
[12]
Md Saiful Islam, Chengfei Liu, and Rui Zhou. 2012. User feedback based query refinement by exploiting skyline operator. In ER .
[13]
Nandish Jayaram, Sidharth Goyal, and Chengkai Li. 2015a. VIIQ: auto-suggestion enabled visual interface for interactive graph query formulation. VLDB (2015), 1940--1943.
[14]
Nandish Jayaram, Arijit Khan, Chengkai Li, Xifeng Yan, and Ramez Elmasri. 2015b. Querying knowledge graphs by example entity tuples. TKDE (2015), 2797--2811.
[15]
Gjergji Kasneci, Fabian M Suchanek, Georgiana Ifrim, Maya Ramanath, and Gerhard Weikum. 2008. Naga: Searching and ranking knowledge. In ICDE .
[16]
Samir Khuller, Anna Moss, and Joseph Seffi Naor. 1999. The budgeted maximum coverage problem. Inform. Process. Lett., Vol. 70, 1 (1999), 39--45.
[17]
Phokion G Kolaitis and Madhukar N Thakur. 1994. Logical definability of NP optimization problems. Information and Computation (1994).
[18]
Dániel Marx. 2008. Parameterized complexity and approximation algorithms. Comput. J., Vol. 51, 1 (2008), 60--78.
[19]
Alexandra Meliou, Wolfgang Gatterbauer, Katherine F Moore, and Dan Suciu. 2009. Why so? or why no? functional causality for explaining query answers. arXiv preprint arXiv:0912.5340 (2009).
[20]
Mohamed Morsey, Jens Lehmann, Sören Auer, and Axel-Cyrille Ngonga Ngomo. 2011. DBpedia SPARQL benchmark--performance assessment with real queries on real data. In ISWC .
[21]
Davide Mottin, Francesco Bonchi, and Francesco Gullo. 2015. Graph query reformulation with diversity. In KDD .
[22]
Davide Mottin, Matteo Lissandrini, Yannis Velegrakis, and Themis Palpanas. 2014. Exemplar queries: Give me an example of what you need. VLDB (2014), 365--376.
[23]
Davide Mottin, Matteo Lissandrini, Yannis Velegrakis, and Themis Palpanas. 2016. Exemplar queries: a new way of searching. VLDB (2016), 741--765.
[24]
Mohammad Hossein Namaki, Yinghui Wu, Qi Song, Peng Lin, and Tingjian Ge. 2017. Discovering Graph Temporal Association Rules. In CIKM. 1697--1706.
[25]
Mohammad Hossein Namaki, Yinghui Wu, and Xin Zhang. 2018. GExp: Cost-aware Graph Exploration with Keywords. SIGMOD .
[26]
Alexandra Poulovassilis. 2018. Applications of Flexible Querying to Graph Data. Graph Data Management . 97--142.
[27]
L Rocach and O Maimon. 2005. Clustering methods Data mining and knowledge discovery handbook. Springer US (2005), 321.
[28]
Qi Song, Mohammad Hossein Namaki, and Yinghui Wu. 2019. Answering Why-Questions for Subgraph Queries in Multi-Attributed Graphs. In ICDE .
[29]
Yinglong Song, Huey Eng Chua, Sourav S Bhowmick, Byron Choi, and Shuigeng Zhou. 2018. BOOMER: Blending Visual Formulation and Processing of P-Homomorphic Queries on Large Networks. In SIGMOD .
[30]
Wen Sun, Achille Fokoue, Kavitha Srinivas, Anastasios Kementsietsidis, Gang Hu, and Guotong Xie. 2015. Sqlgraph: An efficient relational-based property graph store. In SIGMOD .
[31]
Quoc Trung Tran and Chee-Yong Chan. 2010. How to conquer why-not questions. In SIGMOD .
[32]
Elena Vasilyeva, Maik Thiele, Christof Bornhövd, and Wolfgang Lehner. 2016. Answering “why empty” and “why so many?” queries in graph databases. J. Comput. System Sci., Vol. 82, 1 (2016).
[33]
Elena Vasilyeva, Maik Thiele, Adrian Mocan, and Wolfgang Lehner. 2015. Relaxation of subgraph queries delivering empty results. In SSDBM .
[34]
Meng Wang, Jun Liu, Bifan Wei, Siyu Yao, Hongwei Zeng, and Lei Shi. 2018. Answering why-not questions on SPARQL queries. KAIS (2018).
[35]
Mohamed Yahya, Klaus Berberich, Maya Ramanath, and Gerhard Weikum. 2016. Exploratory querying of extended knowledge graphs. VLDB (2016), 1521--1524.
[36]
Shengqi Yang, Fangqiu Han, Yinghui Wu, and Xifeng Yan. 2016. Fast top-k search in knowledge graphs. In ICDE .
[37]
Shengqi Yang, Yinghui Wu, Huan Sun, and Xifeng Yan. 2014. Schemaless and structureless graph querying. VLDB (2014), 565--576.
[38]
Lei Zou, M Tamer Özsu, Lei Chen, Xuchuan Shen, Ruizhe Huang, and Dongyan Zhao. 2014. gStore: a graph-based SPARQL query engine. VLDBJ, Vol. 23, 4 (2014), 565--590.

Cited By

View all
  • (2024)Exploring Optimal Parameters for Expected Results on Radius-Bounded k-Core Queries2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00183(2312-2324)Online publication date: 13-May-2024
  • (2023)Efficient Complex Aggregate Queries with Accuracy Guarantee Based on Execution Cost Model over Knowledge GraphsMathematics10.3390/math1118390811:18(3908)Online publication date: 14-Sep-2023
  • (2022)Finding Multidimensional Constraint Reachable Paths for Attributed GraphsICST Transactions on Scalable Information Systems10.4108/eetsis.v9i4.2581(e2)Online publication date: 22-Aug-2022
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data
June 2019
2106 pages
ISBN:9781450356435
DOI:10.1145/3299869
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 June 2019

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. approximate algorithms
  2. data provenance
  3. property graphs
  4. query by examples
  5. query suggestion
  6. why-questions

Qualifiers

  • Research-article

Funding Sources

Conference

SIGMOD/PODS '19
Sponsor:
SIGMOD/PODS '19: International Conference on Management of Data
June 30 - July 5, 2019
Amsterdam, Netherlands

Acceptance Rates

SIGMOD '19 Paper Acceptance Rate 88 of 430 submissions, 20%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)105
  • Downloads (Last 6 weeks)17
Reflects downloads up to 06 Oct 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Exploring Optimal Parameters for Expected Results on Radius-Bounded k-Core Queries2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00183(2312-2324)Online publication date: 13-May-2024
  • (2023)Efficient Complex Aggregate Queries with Accuracy Guarantee Based on Execution Cost Model over Knowledge GraphsMathematics10.3390/math1118390811:18(3908)Online publication date: 14-Sep-2023
  • (2022)Finding Multidimensional Constraint Reachable Paths for Attributed GraphsICST Transactions on Scalable Information Systems10.4108/eetsis.v9i4.2581(e2)Online publication date: 22-Aug-2022
  • (2022)CRUX: Crowdsourced Materials Science Resource and Workflow ExplorationProceedings of the 31st ACM International Conference on Information & Knowledge Management10.1145/3511808.3557194(5014-5018)Online publication date: 17-Oct-2022
  • (2022)Diversified Subgraph Query Generation with Group FairnessProceedings of the Fifteenth ACM International Conference on Web Search and Data Mining10.1145/3488560.3498525(686-694)Online publication date: 11-Feb-2022
  • (2022)Answering Why-Questions for Subgraph QueriesIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2020.304643634:10(4636-4649)Online publication date: 1-Oct-2022
  • (2022)Subgraph Query Generation with Fairness and Diversity Constraints2022 IEEE 38th International Conference on Data Engineering (ICDE)10.1109/ICDE53745.2022.00278(3106-3118)Online publication date: May-2022
  • (2022)An Execution Cost Model for Aggregate Queries over Knowledge Graphs2021 Ninth International Conference on Advanced Cloud and Big Data (CBD)10.1109/CBD54617.2021.00028(113-120)Online publication date: Mar-2022
  • (2022)Top-k star queries on knowledge graphs through semantic-aware bounding match scoresKnowledge-Based Systems10.1016/j.knosys.2020.106655213:COnline publication date: 23-Apr-2022
  • (2022)Social data provenance framework based on zero-information loss graph databaseSocial Network Analysis and Mining10.1007/s13278-022-00889-612:1Online publication date: 3-Jul-2022
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media