ACM SIGMOD Anthology VLDB dblp.uni-trier.de

On Applying Hash Filters to Improving the Execution of Multi-Join Queries.

Ming-Syan Chen, Hui-I Hsiao, Philip S. Yu: On Applying Hash Filters to Improving the Execution of Multi-Join Queries. VLDB J. 6(2): 121-131(1997)
@article{DBLP:journals/vldb/ChenHY97,
  author    = {Ming-Syan Chen and
               Hui-I Hsiao and
               Philip S. Yu},
  title     = {On Applying Hash Filters to Improving the Execution of Multi-Join
               Queries},
  journal   = {VLDB J.},
  volume    = {6},
  number    = {2},
  year      = {1997},
  pages     = {121-131},
  ee        = {db/journals/vldb/ChenHY97.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

In this paper, we explore an approach of interleaving a bushy execution tree with hash filters to improve the execution of multi-join queries. Similar to semi-joins in distributed query processing, hash filters can be applied to eliminate non-matching tuples from joining relations before the execution of a join, thus reducing the join cost. Note that hash filters built in different execution stages of a bushy tree can have different costs and effects. The effect of hash filters is evaluated first. Then, an efficient scheme to determine an effective sequence of hash filters for a bushy execution tree is developed, where hash filters are built and applied based on the join sequence specified in the bushy tree so that not only is the reduction effect optimized but also the cost associated is minimized. Various schemes using hash filters are implemented and evaluated via simulation. It is experimentally shown that the application of hash filters is in general a very powerful means to improve th e execution of multi-join queries, and the improvement becomes more prominent as the number of relations in a query increases.

Key Words

Hash filters, Parallel query processing, Bushy trees, Sort-merge joins

Copyright © 1997 by Springer, Berlin, Heidelberg. Permission to make digital or hard copies of the abstract is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice along with the full citation.


Online Edition (Springer)

Citation Page

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ... BibTeX

References

[1]
Edward Babb: Implementing a Relational Database by Means of Specialized Hardware. ACM Trans. Database Syst. 4(1): 1-29(1979) BibTeX
[2]
Philip A. Bernstein, Dah-Ming W. Chiu: Using Semi-Joins to Solve Relational Queries. J. ACM 28(1): 25-40(1981) BibTeX
[3]
Dina Bitton, Jim Gray: Disk Shadowing. VLDB 1988: 331-338 BibTeX
[4]
Haran Boral, William Alexander, Larry Clay, George P. Copeland, Scott Danforth, Michael J. Franklin, Brian E. Hart, Marc G. Smith, Patrick Valduriez: Prototyping Bubba, A Highly Parallel Database System. IEEE Trans. Knowl. Data Eng. 2(1): 4-24(1990) BibTeX
[5]
Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Applying Segmented Right-Deep Trees to Pipelining Multiple Hash Joins. IEEE Trans. Knowl. Data Eng. 7(4): 656-668(1995) BibTeX
[6]
Ming-Syan Chen, Philip S. Yu: Interleaving a Join Sequence with Semijoins in Distributed Query Processing. IEEE Trans. Parallel Distrib. Syst. 3(5): 611-621(1992) BibTeX
[7]
Ming-Syan Chen, Philip S. Yu, Kun-Lung Wu: Optimization of Parallel Execution for Multi-Join Queries. IEEE Trans. Knowl. Data Eng. 8(3): 416-428(1996) BibTeX
[8]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, Rick Rasmussen: The Gamma Database Machine Project. IEEE Trans. Knowl. Data Eng. 2(1): 44-62(1990) BibTeX
[9]
David J. DeWitt, Jim Gray: Parallel Database Systems: The Future of High Performance Database Systems. Commun. ACM 35(6): 85-98(1992) BibTeX
[10]
Danièle Gardy, Claude Puech: On the Effects of Join Operations on Relation Sizes. ACM Trans. Database Syst. 14(4): 574-603(1989) BibTeX
[11]
Wei Hong: Exploiting Inter-Operation Parallelism in XPRS. SIGMOD Conference 1992: 19-28 BibTeX
[12]
Wei Hong, Michael Stonebraker: Optimization of Parallel Query Execution Plans in XPRS. PDIS 1991: 218-225 BibTeX
[13]
Hui-I Hsiao, David J. DeWitt: A Performance Study of Three High Availability Data Replication Strategies. PDIS 1991: 18-28 BibTeX
[14]
Yannis E. Ioannidis, Younkyung Cha Kang: Left-Deep vs. Bushy Trees: An Analysis of Strategy Spaces and its Implications for Query Optimization. SIGMOD Conference 1991: 168-177 BibTeX
[15]
Matthias Jarke, Jürgen Koch: Query Optimization in Database Systems. ACM Comput. Surv. 16(2): 111-152(1984) BibTeX
[16]
...
[17]
Ravi Krishnamurthy, Haran Boral, Carlo Zaniolo: Optimization of Nonrecursive Queries. VLDB 1986: 128-137 BibTeX
[18]
Ming-Ling Lo, Ming-Syan Chen, Chinya V. Ravishankar, Philip S. Yu: On Optimal Processor Allocation to Support Pipelined Hash Joins. SIGMOD Conference 1993: 69-78 BibTeX
[19]
...
[20]
Hongjun Lu, Ming-Chien Shan, Kian-Lee Tan: Optimization of Multi-Way Join Queries for Parallel Execution. VLDB 1991: 549-560 BibTeX
[21]
Priti Mishra, Margaret H. Eich: Join Processing in Relational Databases. ACM Comput. Surv. 24(1): 63-113(1992) BibTeX
[22]
Hamid Pirahesh, C. Mohan, Josephine M. Cheng, T. S. Liu, Patricia G. Selinger: Parallelism in Relational Data Base Systems: Architectural Issues and Design Approaches. DPDS 1990: 4-29 BibTeX
[23]
Nick Roussopoulos, Hyunchul Kang: A Pipeline N-way Join Algorithm Based on the 2-way Semijoin Program. IEEE Trans. Knowl. Data Eng. 3(4): 486-495(1991) BibTeX
[24]
...
[25]
Donovan A. Schneider, David J. DeWitt: A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment. SIGMOD Conference 1989: 110-121 BibTeX
[26]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34 BibTeX
[27]
Michael Stonebraker, Randy H. Katz, David A. Patterson, John K. Ousterhout: The Design of XPRS. VLDB 1988: 318-330 BibTeX
[28]
Arun N. Swami: Optimization of Large Join Queries: Combining Heuristic and Combinatorial Techniques. SIGMOD Conference 1989: 367-376 BibTeX
[29]
Arun N. Swami, Anoop Gupta: Optimization of Large Join Queries. SIGMOD Conference 1988: 8-17 BibTeX
[30]
...
[31]
Patrick Valduriez, Georges Gardarin: Join and Semijoin Algorithms for a Multiprocessor Database Machine. ACM Trans. Database Syst. 9(1): 133-161(1984) BibTeX
[32]
Christopher B. Walton, Alfred G. Dale, Roy M. Jenevein: A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. VLDB 1991: 537-548 BibTeX
[33]
S. Bing Yao: Approximating the Number of Accesses in Database Organizations. Commun. ACM 20(4): 260-261(1977) BibTeX
[34]
Philip S. Yu, Ming-Syan Chen, Hans-Ulrich Heiss, Sukho Lee: On Workload Characterization of Relational Database Environments. IEEE Trans. Software Eng. 18(4): 347-355(1992) BibTeX
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Journal: 1992-1995 Copyright © by VLDB Endowment / 1996-... Copyright © by Springer Verlag,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sun May 17 00:31:30 2009