
2004 SIGMOD Test of Time Award
From Structured Documents to Novel Query Facilities
Vassilis Christophides, Serge Abiteboul, Sophie Cluet, and Michel Scholl
This paper presented several long-lasting contributions, combining
SGML-based document management with database technology:
- It presented a technique to map DTDs to database schemata and to
store SGML documents in a database in such a way that the document
structure is preserved and can be used for querying.
- It introduced paths and attributes of SGML as first-class
citizens. This is the most interesting novelty: schema information and
data can be queried in a homogeneous fashion, so SGML documents can be
navigated by their structure as well as by their values.
- As a practical consequence, information retrieval based on text
patterns can be generalized to include document structure. This
adds semantics to the query facilities of an information system.
- The paper laid the formal foundations for query languages for
semistructured data, which later, in the form of XPath, became the core
of several such query languages.
Although the paper deals primarily with SGML, most of the ideas were
carried over to XML and gained high significance there.
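
The idea of treating paths as first-class values and combining them with
text patterns can be illustrated with a minimal sketch. It is not the
paper's query language or storage scheme. The Node class, the paths and
query functions, and the sample document are all hypothetical names
chosen for this illustration, and the tree stands in for a stored SGML
or XML document.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical element node: tag name, own text, child elements."""
    tag: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def paths(node, prefix=()):
    """Yield (path, node) pairs; each path is a tuple of tag names from the root."""
    path = prefix + (node.tag,)
    yield path, node
    for child in node.children:
        yield from paths(child, path)


def query(root, path_suffix, pattern):
    """Return (path, text) pairs whose path ends with path_suffix
    and whose text matches the regular expression pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    suffix = tuple(path_suffix)
    return [
        ("/".join(path), node.text)
        for path, node in paths(root)
        if path[-len(suffix):] == suffix and regex.search(node.text)
    ]


# Made-up sample document, for illustration only.
doc = Node("article", children=[
    Node("title", "Querying structured documents"),
    Node("section", children=[
        Node("title", "Paths as values"),
        Node("para", "Paths can be queried like ordinary data."),
    ]),
    Node("section", children=[
        Node("title", "Storage"),
        Node("para", "Documents are stored in a database."),
    ]),
])

# Structure and content in one query: section titles that mention "paths".
hits = query(doc, ["section", "title"], r"paths")
```

Because each result carries its path as an ordinary value, the same
query returns both structural information (where the match sits) and
data (the matching text), which is the homogeneity the citation
highlights.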