Mind Your Vocabulary: Query Mapping Across Heterogeneous Information Sources

Chen-Chuan K. Chang*       Hector Garcia-Molina
Stanford University       Stanford University
changcc@cs.stanford.edu       hector@cs.stanford.edu

Abstract

In this paper we present a mechanism for translating constraint queries, i.e., Boolean expressions of constraints, across heterogeneous information sources. Integrating such systems is difficult in part because they use a wide range of (selection or join) constraints as the vocabulary for formulating queries. We describe algorithms that apply user-provided mapping rules to translate query constraints into ones that are understood and supported in another context, e.g., that use the proper operators and value formats. We show that the translated queries minimally subsume the original ones. Unlike other query mapping work, we explicitly account for inter-dependencies among constraints, i.e., we handle constraints that cannot be translated independently. Furthermore, when constraints are not fully supported, our framework explores relaxations (semantic rewritings) into the closest supported version. Our most sophisticated algorithm (Algorithm TDQM) does not blindly convert queries to DNF (which would be easier to translate, but expensive); instead it performs a top-down mapping of a query tree and converts the query structure locally, only when necessary. Finally, we extend the framework to also handle join constraints.
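As a rough illustration of rule-based constraint mapping with inter-dependent constraints, the following minimal sketch translates a single conjunction of constraints. It is not Algorithm TDQM and not the paper's rule language: the attribute names (ln, fn, author, title, ti), the rule functions, and the convention of returning untranslated constraints for later filtering are all assumptions made for this example.

```python
from typing import Callable, List, Optional, Set, Tuple

# A constraint is (attribute, operator, value); all names here are hypothetical.
Constraint = Tuple[str, str, str]
# A rule inspects the remaining constraints and returns
# (constraints it consumed, target constraints it emits), or None if it does not apply.
Rule = Callable[[Set[Constraint]], Optional[Tuple[Set[Constraint], List[Constraint]]]]


def map_author(conj: Set[Constraint]):
    """Hypothetical rule: last name and first name are inter-dependent."""
    vals = {a: v for (a, op, v) in conj if op == "=" and a in ("ln", "fn")}
    if "ln" in vals and "fn" in vals:
        # Both present: translate them together into one exact constraint.
        matched = {("ln", "=", vals["ln"]), ("fn", "=", vals["fn"])}
        return matched, [("author", "=", f"{vals['ln']}, {vals['fn']}")]
    if "ln" in vals:
        # Only the last name: relax to a weaker constraint that subsumes it.
        return {("ln", "=", vals["ln"])}, [("author", "contains", vals["ln"])]
    return None


def map_title(conj: Set[Constraint]):
    """Hypothetical rule: a title constraint maps independently (attribute rename)."""
    for (a, op, v) in conj:
        if a == "title" and op == "contains":
            return {(a, op, v)}, [("ti", "contains", v)]
    return None


def translate_conjunction(conj: List[Constraint], rules: List[Rule]):
    """Apply each rule once, in order, to the constraints not yet consumed.

    Returns (translated constraints, untranslated constraints). Untranslated
    constraints are dropped from the target query, so the translation subsumes
    the original and the leftovers must be checked by a post-filter.
    """
    remaining = set(conj)
    translated: List[Constraint] = []
    for rule in rules:
        result = rule(remaining)
        if result is not None:
            matched, emitted = result
            remaining -= matched
            translated.extend(emitted)
    return translated, remaining


query = [("ln", "=", "Clancy"), ("fn", "=", "Tom"), ("title", "contains", "red")]
translated, leftover = translate_conjunction(query, [map_author, map_title])
```

Translating ln and fn separately would lose the fact that the target source expects a single combined author value, which is why the first rule consumes both constraints at once. A constraint that no rule covers, such as fn on its own, ends up in the leftover set rather than in the translated query.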