ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Considering Data Skew Factor in Multi-Way Join Query Optimization for Parallel Execution.

Kien A. Hua, Yu-lung Lo, Honesty C. Young: Considering Data Skew Factor in Multi-Way Join Query Optimization for Parallel Execution. VLDB J. 2(3): 303-330(1993)
@article{DBLP:journals/vldb/HuaLY93,
  author    = {Kien A. Hua and
               Yu-lung Lo and
               Honesty C. Young},
  title     = {Considering Data Skew Factor in Multi-Way Join Query Optimization
               for Parallel Execution},
  journal   = {VLDB J.},
  volume    = {2},
  number    = {3},
  year      = {1993},
  pages     = {303-330},
  ee        = {db/journals/vldb/HuaLY93.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

A consensus on parallel architecture for very large database management has emerged. This architecture is based on a shared-nothing hardware organization. The computation model is very sensitive to skew in tuple distribution, however. Recently, several parallel join algorithms with dynamic load balancing capabilities have been proposed to address this issue, but none of them consider multi-way join problems. In this article we propose a dynamic load balancing technique for multi-way joins, and investigate the effect of load balancing on query optimization. In particular, we present a join-ordering strategy that takes load-balancing issues into consideration. Our performance study indicates that the proposed query optimization technique can provide very impressive performance improvements over conventional approaches.

Copyright © 1993 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Key Words

Parallel-database computer, load balancing, multi-way join, query optimization.

Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ... BibTeX

References

[Boral et al. 1990]
Haran Boral, William Alexander, Larry Clay, George P. Copeland, Scott Danforth, Michael J. Franklin, Brian E. Hart, Marc G. Smith, Patrick Valduriez: Prototyping Bubba, A Highly Parallel Database System. IEEE Trans. Knowl. Data Eng. 2(1): 4-24(1990) BibTeX
[Chen et al. 1992]
Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Using Segmented Right-Deep Trees for the Execution of Pipelined Hash Joins. VLDB 1992: 15-26 BibTeX
[DeWitt et al. 1990]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, Rick Rasmussen: The Gamma Database Machine Project. IEEE Trans. Knowl. Data Eng. 2(1): 44-62(1990) BibTeX
[DeWitt et al. 1984]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 BibTeX
[DeWitt et al. 1992]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40 BibTeX
[Englert et al. 1990]
Susanne Englert, Jim Gray, Terrye Kocher, Praful Shah: A Benchmark of NonStop SQL Release 2 Demonstrating Near-Linear Speedup and Scaleup on Large Databases. SIGMETRICS 1990: 245-246 BibTeX
[Hong & Stonebraker 1991]
Wei Hong, Michael Stonebraker: Optimization of Parallel Query Execution Plans in XPRS. PDIS 1991: 218-225 BibTeX
[Hua & Lee 1990]
Kien A. Hua, Chiang Lee: An Adaptive Data Placement Scheme for Parallel Database Computer Systems. VLDB 1990: 493-506 BibTeX
[Hua & Lee 1991]
Kien A. Hua, Chiang Lee: Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning. VLDB 1991: 525-535 BibTeX
[Hua et al. 1993]
Kien A. Hua, Yu-lung Lo, Honesty C. Young: Including the Load Balancing Issue in the Optimization of Multi-way Join Queries for Shared-Nothing Database Computers. PDIS 1993: 74-83 BibTeX
[Hua & Young 1990]
...
[Jordan 1978]
...
[Kitsuregawa & Ogawa 1990]
Masaru Kitsuregawa, Yasushi Ogawa: Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC). VLDB 1990: 210-221 BibTeX
[Kitsuregawa et al. 1983]
Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka: Application of Hash to Data Base Machine and Its Architecture. New Generation Comput. 1(1): 63-74(1983) BibTeX
[Lakshmi & Yu 1988]
M. Seetha Lakshmi, Philip S. Yu: Effect of Skew on Join Performance in Parallel Architectures. DPDS 1988: 107-120 BibTeX
[Lakshmi & Yu 1989]
M. Seetha Lakshmi, Philip S. Yu: Limiting Factors of Join Performance on Parallel Processors. ICDE 1989: 488-496 BibTeX
[Lorie et al. 1991]
...
[Lu et al. 1991]
Hongjun Lu, Ming-Chien Shan, Kian-Lee Tan: Optimization of Multi-Way Join Queries for Parallel Execution. VLDB 1991: 549-560 BibTeX
[Sacco 1986]
Giovanni Maria Sacco: Fragmentation: A Technique for Efficient Query Processing. ACM Trans. Database Syst. 11(2): 113-133(1986) BibTeX
[Schneider & DeWitt 1989]
Donovan A. Schneider, David J. DeWitt: A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment. SIGMOD Conference 1989: 110-121 BibTeX
[Schneider & DeWitt 1990]
Donovan A. Schneider, David J. DeWitt: Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469-480 BibTeX
[Selinger et al. 1979]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34 BibTeX
[Stonebraker 1986]
Michael Stonebraker: The Case for Shared Nothing. IEEE Database Eng. Bull. 9(1): 4-9(1986) BibTeX
[Stonebraker 1988]
Michael Stonebraker, Randy H. Katz, David A. Patterson, John K. Ousterhout: The Design of XPRS. VLDB 1988: 318-330 BibTeX
[Su 1983]
...
[Swami et al. 1992]
...
[Tan & Lu 1992]
...
[Teradata Corporation 1988]
...
[Turbyfill 1987]
...
[Walton et al. 1991]
Christopher B. Walton, Alfred G. Dale, Roy M. Jenevein: A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. VLDB 1991: 537-548 BibTeX
[Wolf et al. 1990]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu: An Effective Algorithm for Parallelizing Sort Merge in the Presence of Data Skew. DPDS 1990: 103-115 BibTeX
[Wolf et al. 1991a]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: An Effective Algorithm for Parallelizing Hash Joins in the Presence of Data Skew. ICDE 1991: 200-209 BibTeX
[Wolf et al. 1991b]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: Comparative Performance of Parallel Join Algorithms. PDIS 1991: 78-88 BibTeX
[Wong & Youssefi 1976]
Eugene Wong, Karel Youssefi: Decomposition - A Strategy for Query Processing. ACM Trans. Database Syst. 1(3): 223-241(1976) BibTeX
[Zipf 1949]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949
BibTeX

Referenced by

  1. Lionel Brunie, Harald Kosch: ModParOpt: A Modular Query Optimizer for Multi-Query Parallel Databases. ADBIS 1997: 97-106
  2. Ming-Syan Chen, Philip S. Yu, Kun-Lung Wu: Optimization of Parallel Execution for Multi-Join Queries. IEEE Trans. Knowl. Data Eng. 8(3): 416-428(1996)
  3. Kien A. Hua, Chiang Lee, Chau M. Hua: Dynamic Load Balancing in Multicomputer Database Systems Using Partition Tuning. IEEE Trans. Knowl. Data Eng. 7(6): 968-983(1995)
  4. Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Applying Segmented Right-Deep Trees to Pipelining Multiple Hash Joins. IEEE Trans. Knowl. Data Eng. 7(4): 656-668(1995)
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Journal: 1992-1995 Copyright © by VLDB Endowment / 1996-... Copyright © by Springer Verlag,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sun May 17 00:31:18 2009