Temporal XML: modeling, indexing, and query processing

  • Regular Paper
  • The VLDB Journal

Abstract

In this paper, we address the problem of modeling and implementing temporal data in XML. We propose a data model for tracking historical information in an XML document and for recovering the state of the document as of any given time. We study the temporal constraints imposed by the data model and present algorithms for validating a temporal XML document against these constraints, along with methods for fixing inconsistent documents. In addition, we discuss different ways of mapping the abstract representation into a temporal XML document, and introduce TXPath, a temporal XML query language that extends XPath 2.0. In the second part of the paper, we present our approach for summarizing and indexing temporal XML documents. In particular, we show that by indexing continuous paths, i.e., paths that are valid continuously during a certain interval in a temporal XML graph, we can dramatically increase query performance. To achieve this, we introduce a new class of summaries, denoted TSummary, that adds the time dimension to the well-known path summarization schemes. Within this framework, we present two new summaries: LCP and Interval summaries. The indexing scheme, denoted TempIndex, integrates these summaries with additional data structures. We give a query processing strategy based on TempIndex and a type of ancestor-descendant encoding, denoted temporal interval encoding. We present a persistent implementation of TempIndex and compare it against two systems, one based on a non-temporal path index and one based on DOM. Finally, we sketch a language for updates, and show that the cost of updating the index is compatible with real-world requirements.
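To make two notions from the abstract concrete (recovering the document state as of a given time, and a path being valid continuously during an interval), the following is a minimal sketch under simplifying assumptions. It is not the paper's data model or algorithm: it assumes a tree rather than a graph, attaches one closed validity interval to each node, and uses the hypothetical names `TNode`, `snapshot`, and `continuous_interval`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumption: an open-ended interval ("still valid") ends at infinity.
NOW = float("inf")


@dataclass
class TNode:
    """Hypothetical temporal node: valid under its parent during [start, end]."""
    label: str
    start: int
    end: float = NOW
    children: List["TNode"] = field(default_factory=list)


def snapshot(node: TNode, t: int) -> Optional[tuple]:
    """Return the subtree as (label, children) at time t, or None if not valid."""
    if not (node.start <= t <= node.end):
        return None
    kids = [s for c in node.children if (s := snapshot(c, t)) is not None]
    return (node.label, kids)


def continuous_interval(path: List[TNode]) -> Optional[Tuple[float, float]]:
    """Intersect the intervals along a root-to-node path.

    The path is continuous during the returned interval; None means the
    nodes were never all valid at the same time.
    """
    lo = max(n.start for n in path)
    hi = min(n.end for n in path)
    return (lo, hi) if lo <= hi else None


# Illustrative data only: a player who belongs to a team from time 5 to 9.
player = TNode("player", 5, 9)
team = TNode("team", 0, NOW, [player])
league = TNode("league", 0, NOW, [team])

state_at_7 = snapshot(league, 7)   # player is present
state_at_3 = snapshot(league, 3)   # player is absent
path_validity = continuous_interval([league, team, player])
```

In this sketch, a snapshot simply filters out nodes whose interval does not contain the requested time, and a continuous path is characterized by a non-empty intersection of the intervals along it. An index over such paths would store the intersection once instead of recomputing it per query, which is the intuition behind indexing continuous paths.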

Author information

Corresponding author

Correspondence to Flavio Rizzolo.

About this article

Cite this article

Rizzolo, F., Vaisman, A.A. Temporal XML: modeling, indexing, and query processing. The VLDB Journal 17, 1179–1212 (2008). https://doi.org/10.1007/s00778-007-0058-x
