Using Schematically Heterogeneous Structures
Renée J. Miller
(Ohio State University)
Schematic heterogeneity arises when information that is represented as
data under one schema is represented within the schema (as metadata)
in another. Schematic heterogeneity is an important class of
heterogeneity that arises frequently in integrating legacy data in
federated or data warehousing applications. Traditional query
languages and view mechanisms are insufficient for reconciling and
translating data between schematically heterogeneous schemas. Higher
order query languages that permit quantification over schema labels
have been proposed for querying and restructuring data
between schematically disparate schemas. We extend this work by
considering how these languages can be used in practice.
Specifically, we consider a restricted class of higher order views and
show the power of these views in integrating legacy structures. Our
results provide insights into the properties of restructuring
transformations required to resolve schematic discrepancies. In
addition, we show how the use of these views permits schema browsing
and new forms of data independence that are important for global
information systems. Furthermore, these views provide a framework for
integrating semi-structured and unstructured queries, such as keyword
searches, into a structured querying environment. We show how these
views can be used with minimal extensions to existing query engines.
We give conditions under which a higher order view is usable for
answering a query and provide query translation algorithms.
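
As a small illustration of the discrepancy described above, the following
Python sketch (not taken from the paper) stores the same stock quotes under
two schemas. In the first, company names are data values; in the second,
they are attribute names, that is, metadata. The relation contents, the
attribute names (date, company, price) and the function names are all
hypothetical. The second function iterates over attribute names, which is
a rough analogue of quantifying over schema labels.

```python
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical schema A: company names appear as data values.
prices_as_data: List[Row] = [
    {"date": "d1", "company": "acme", "price": 10},
    {"date": "d1", "company": "globex", "price": 20},
    {"date": "d2", "company": "acme", "price": 11},
]


def promote_to_labels(rows: List[Row]) -> List[Row]:
    """Restructure schema A so that each company becomes an attribute name."""
    by_date: Dict[object, Row] = {}
    for r in rows:
        target = by_date.setdefault(r["date"], {"date": r["date"]})
        target[str(r["company"])] = r["price"]
    return list(by_date.values())


def demote_to_data(rows: List[Row], key: str = "date") -> List[Row]:
    """Range over attribute names and turn each one back into a data value."""
    out: List[Row] = []
    for r in rows:
        for label, value in r.items():
            if label != key:
                out.append({key: r[key], "company": label, "price": value})
    return out


# Hypothetical schema B: one attribute per company.
prices_as_labels = promote_to_labels(prices_as_data)
```

A query such as "which companies have a quote on d1" is an ordinary
selection under the first schema, but under the second it has to inspect
the attribute names of the row for d1, which a traditional first order
query cannot do.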
The full paper is available from
http://www.cis.ohio-state.edu/~rjmiller.