An Adaptive Query Execution System for Data Integration
Zachary G. Ives,
Daniela Florescu,
Marc Friedman,
Alon Y. Levy, and
Daniel S. Weld
Query processing in data integration occurs over network-bound, autonomous data sources. This requires extensions to traditional optimization and execution techniques for three reasons: there is an absence of quality statistics about the data, data transfer rates are unpredictable and bursty, and slow or unavailable data sources can often be replaced by overlapping or mirrored sources. This paper presents the Tukwila data integration system, designed to support adaptivity at its core using a two-pronged approach. Interleaved planning and execution with partial optimization allows Tukwila to quickly recover from decisions based on inaccurate estimates. During execution, Tukwila uses adaptive query operators such as the double pipelined hash join, which produces answers quickly, and the dynamic collector, which robustly and efficiently computes unions across overlapping data sources. We demonstrate that the Tukwila architecture extends previous innovations in adaptive execution (such as query scrambling, mid-execution re-optimization, and choose nodes), and we present experimental evidence that our techniques result in behavior desirable for a data integration system.
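To illustrate why a double pipelined hash join can produce answers quickly, here is a minimal Python sketch of the general idea: a hash table is kept for each input, and every arriving tuple is inserted into its own table and immediately probed against the other. This is not Tukwila's implementation. The function name, the key-extractor parameters, and the round-robin interleaving used to imitate tuples arriving from two sources are all assumptions made for illustration, and the sketch is single-threaded and assumes both inputs fit in memory.

```python
from collections import defaultdict
from itertools import zip_longest

_DONE = object()  # sentinel marking an exhausted input


def double_pipelined_hash_join(left, right, left_key, right_key):
    """Yield (left_row, right_row) pairs as soon as both halves have arrived.

    Hypothetical helper for illustration only; names and signature are
    assumptions, not taken from the paper.
    """
    left_table = defaultdict(list)   # join key -> left rows seen so far
    right_table = defaultdict(list)  # join key -> right rows seen so far

    # Round-robin over the inputs to imitate interleaved arrival (assumption).
    for l_row, r_row in zip_longest(left, right, fillvalue=_DONE):
        if l_row is not _DONE:
            key = left_key(l_row)
            left_table[key].append(l_row)            # build on the left side
            for match in right_table.get(key, ()):   # probe the right side
                yield (l_row, match)
        if r_row is not _DONE:
            key = right_key(r_row)
            right_table[key].append(r_row)           # build on the right side
            for match in left_table.get(key, ()):    # probe the left side
                yield (match, r_row)


left_rows = [(1, "a"), (2, "b"), (3, "c")]
right_rows = [(2, "x"), (1, "y"), (2, "z")]
joined = list(double_pipelined_hash_join(
    left_rows, right_rows, lambda r: r[0], lambda r: r[0]))
```

Because each tuple probes only the tuples that arrived before it, every matching pair is emitted exactly once, and the first result can appear before either input has been fully read. A conventional hash join, by contrast, must finish building its table on one input before it can emit anything.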
@inproceedings{DBLP:conf/sigmod/IvesFFLW99,
author = {Zachary G. Ives and
Daniela Florescu and
Marc Friedman and
Alon Y. Levy and
Daniel S. Weld},
editor = {Alex Delis and
Christos Faloutsos and
Shahram Ghandeharizadeh},
title = {An Adaptive Query Execution System for Data Integration},
booktitle = {SIGMOD 1999, Proceedings ACM SIGMOD International Conference
on Management of Data, June 1-3, 1999, Philadelphia, Pennsylvania,
USA},
publisher = {ACM Press},
year = {1999},
isbn = {1-58113-084-8},
pages = {299-310},
crossref = {DBLP:conf/sigmod/99},
bibsource = {DBLP, http://dblp.uni-trier.de} }
Copyright(C) 2000 ACM