Digital Symposium Collection 2000  

Relational Databases for Querying XML Documents: Limitations and Opportunities

Jayavel Shanmugasundaram, Kristin Tufte, Chun Zhang, Gang He, David J. DeWitt, and Jeffrey F. Naughton

Abstract
XML is fast emerging as the dominant standard for representing data on the World Wide Web. Sophisticated query engines that allow users to effectively tap the data stored in XML documents will be crucial to exploiting the full power of XML. While there has been a great deal of activity recently proposing new semistructured data models and query languages for this purpose, this paper explores the more conservative approach of using traditional relational database engines for processing XML documents conforming to Document Type Descriptors (DTDs). To this end, we have developed algorithms and implemented a prototype system that converts XML documents to relational tuples, translates semistructured queries over XML documents to SQL queries over tables, and converts the results to XML. We have qualitatively evaluated this approach using several real DTDs drawn from diverse domains. It turns out that the relational approach can handle most (but not all) of the semantics of semistructured queries over XML data, but is likely to be effective only in some cases. We identify the causes of these limitations and propose certain extensions to the relational model that would make it more appropriate for processing queries over XML documents.
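
The three-step pipeline named in the abstract (XML to tuples, query to SQL, results back to XML) can be illustrated with a toy sketch. This is not the paper's algorithm or prototype: the sample document, the hand-written two-table schema, and the function names `shred`, `titles_by_author`, and `to_xml` are all assumptions made for illustration, and the schema is written by hand rather than derived from a DTD.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are illustrative only.
SAMPLE_XML = """
<bib>
  <book><title>Databases</title><author>Ann Lee</author><author>Bo Chen</author></book>
  <book><title>XML Basics</title><author>Ann Lee</author></book>
</bib>
"""

def shred(conn, xml_text):
    """Step 1: store each <book> and each <author> as a relational tuple."""
    conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE author (book_id INTEGER, name TEXT)")
    root = ET.fromstring(xml_text.strip())
    for book_id, book in enumerate(root.findall("book"), start=1):
        conn.execute("INSERT INTO book VALUES (?, ?)",
                     (book_id, book.findtext("title")))
        # A repeated child element goes into its own table, keyed by the parent.
        for author in book.findall("author"):
            conn.execute("INSERT INTO author VALUES (?, ?)",
                         (book_id, author.text))

def titles_by_author(conn, name):
    """Step 2: express 'titles of books by a given author' as a SQL join."""
    rows = conn.execute(
        "SELECT b.title FROM book b JOIN author a ON a.book_id = b.id "
        "WHERE a.name = ? ORDER BY b.id",
        (name,),
    )
    return [title for (title,) in rows]

def to_xml(titles):
    """Step 3: wrap the result tuples back into an XML fragment."""
    result = ET.Element("result")
    for title in titles:
        ET.SubElement(result, "title").text = title
    return ET.tostring(result, encoding="unicode")

conn = sqlite3.connect(":memory:")
shred(conn, SAMPLE_XML)
answer_xml = to_xml(titles_by_author(conn, "Ann Lee"))
print(answer_xml)
```

Even in this small case, the repeated `author` element needs a second table and a join, which gives a flavor of why the mapping from nested documents to flat relations is the central design question.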


References

[1]
Serge Abiteboul, Dallan Quass, Jason McHugh, Jennifer Widom, Janet L. Wiener: The Lorel Query Language for Semistructured Data. Int. J. on Digital Libraries 1(1): 68-88 (1997)
[2]
...
[3]
W3C: Extensible Markup Language (XML) 1.0. http://www.w3.org/TR/REC-xml
[4]
...
[5]
Peter Buneman, Susan B. Davidson, Gerd G. Hillebrand, Dan Suciu: A Query Language and Optimization Techniques for Unstructured Data. SIGMOD Conference 1996: 505-516
[6]
Vassilis Christophides, Serge Abiteboul, Sophie Cluet, Michel Scholl: From Structured Documents to Novel Query Facilities. SIGMOD Conference 1994: 313-324
[7]
George P. Copeland, Setrag Khoshafian: A Decomposition Storage Model. SIGMOD Conference 1985: 268-279
[8]
Robin Cover: The SGML/XML Web Page. http://www.oasis-open.org/cover/xml.html
[9]
Alin Deutsch, Mary F. Fernandez, Daniela Florescu, Alon Y. Levy, Dan Suciu: XML-QL: A Query Language for XML. http://www.w3.org/TR/NOTE-xml-ql/
[10]
Alin Deutsch, Mary F. Fernandez, Dan Suciu: Storing Semistructured Data with STORED. SIGMOD Conference 1999: 431-442
[11]
Ronald Fagin: Multivalued Dependencies and a New Normal Form for Relational Databases. TODS 2(3): 262-278 (1977)
[12]
Mary F. Fernandez, Dan Suciu: Optimizing Regular Path Expressions Using Graph Schemas. ICDE 1998: 14-23
[13]
G. Jaeschke, Hans-Jörg Schek: Remarks on the Algebra of Non First Normal Form Relations. PODS 1982: 124-138
[14]
Jason McHugh, Serge Abiteboul, Roy Goldman, Dallan Quass, Jennifer Widom: Lore: A Database Management System for Semistructured Data. SIGMOD Record 26(3): 54-66 (1997)
[15]
...
[16]
...
[17]
...
[18]
W3C: The W3C Query Language Workshop, December 1998, Boston, MA, USA. (1998) http://www.w3.org/TandS/QL/QL98/
[19]
...
[20]
Timos K. Sellis: Multiple-Query Optimization. TODS 13(1): 23-52 (1988)
[21]
Carlo Zaniolo: The Database Language GEM. SIGMOD Conference 1983: 207-218

Referenced by

  1. Michael J. Carey: Review - Relational Databases for Querying XML Documents: Limitations and Opportunities. ACM SIGMOD Digital Review 2 (2000)
  2. H. V. Jagadish: Review - Relational Databases for Querying XML Documents: Limitations and Opportunities. ACM SIGMOD Digital Review 1 (1999)

BIBTEX

@inproceedings{DBLP:conf/vldb/ShanmugasundaramGTZDN99,
  author    = {Jayavel Shanmugasundaram and
               Kristin Tufte and
               Chun Zhang and
               Gang He and
               David J. DeWitt and
               Jeffrey F. Naughton},
  editor    = {Malcolm P. Atkinson and
               Maria E. Orlowska and
               Patrick Valduriez and
               Stanley B. Zdonik and
               Michael L. Brodie},
  title     = {Relational Databases for Querying XML Documents: Limitations
               and Opportunities},
  booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
               Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
               UK},
  publisher = {Morgan Kaufmann},
  year      = {1999},
  isbn      = {1-55860-615-5},
  pages     = {302-314},
  crossref  = {DBLP:conf/vldb/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Copyright(C) 2000 ACM