Graph Patterns: Structure, Query Answering and Applications in Schema Mappings and Formal Language Theory
View/ Open
Date
2013Author
Reutter, Juan L.
Metadata
Abstract
Graph data appears in a variety of application domains, and many uses of it, such as querying,
matching, and transforming data, naturally result in incompletely specified graph data, i.e.,
graph patterns. Queries need to be posed against such data, but techniques for querying patterns
are generally lacking, and even simple properties of graph patterns, such as the languages
needed to specify them, are not well understood.
In this dissertation we present several contributions in the study of graph patterns. We
analyze how to query them and how to use them as queries. We also analyze some of their
applications in two different contexts: schema mapping specification and data exchange for
graph databases, and formal language theory. We first identify key features of patterns, such as
node and label variables and edges specified by regular expressions, and define a classification
of patterns based on them. Next we study how to answer standard graph queries over graph patterns,
and give precise characterizations of both data and combined complexity for each class
of patterns. If complexity is high, we do further analysis of features that lead to intractability, as
well as lower-complexity restrictions that guarantee tractability. We then turn to the the study
of schema mappings for graph databases. As for relational and XML databases, our mapping
languages are based on patterns. They subsume all previously considered mapping languages
for graph databases, and are capable of expressing many data exchange scenarios in the graph
database context. We study the problems of materializing solutions and query answering for
data exchange under these mappings, analyze their complexity, and identify relevant classes of
mappings and queries for which these problems can be solved efficiently. We also introduce a
new model of automata that is based on graph patterns, and define two modes of acceptance
for them. We show that this model has applications not only in graph databases but in several
other contexts. We study the basic properties of such automata, and the key computational
tasks associated with them.