Your Mediators Need Data Conversion!
Sophie Cluet (Institut National de Recherche en Informatique et en Automatique, France)
Claude Delobel (Laboratoire de Recherche en Informatique, Orsay, France)
Jérôme Siméon (Institut National de Recherche en Informatique et en Automatique, France)
Katarzyna Smaga (Institut National de Recherche en Informatique et en Automatique, France)

Due to the development of the World Wide Web, the integration of heterogeneous data sources has become a major concern of the database community. Appropriate architectures and query languages have been proposed. Yet, the problem of data conversion which is essential for the development of mediators/wrappers architectures has remained largely unexplored.

In this paper, we present the YAT system for data conversion. This system provides tools for the specification and the implementation of data conversions among heterogeneous data sources. It relies on a middleware model, a declarative language, a customization mechanism and a graphical interface.

The model is based on named trees with ordered and labeled nodes. Like semistructured data models, it is simple enough to facilitate the representation of any data. Its main originality is that it allows to reason at various levels of representation. The YAT conversion language (called YATL) is declarative, rule-based and features enhanced pattern matching facilities and powerful restructuring primitives. It allows to preserve or reconstruct the order of collections. The customization mechanism relies on program instantiations: an existing program may be instantiated into a more specific one, and then easily modified. We also present the architecture, implementation and practical use of the YAT prototype, currently under evaluation within the OPAL project.