ACM SIGMOD Anthology VLDB dblp.uni-trier.de

The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method.

Masaru Kitsuregawa, Masaya Nakayama, Mikio Takagi: The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method. VLDB 1989: 257-266
@inproceedings{DBLP:conf/vldb/KitsuregawaNT89,
  author    = {Masaru Kitsuregawa and
               Masaya Nakayama and
               Mikio Takagi},
  editor    = {Peter M. G. Apers and
               Gio Wiederhold},
  title     = {The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE
               Hash Join Method},
  booktitle = {Proceedings of the Fifteenth International Conference on Very
               Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands},
  publisher = {Morgan Kaufmann},
  year      = {1989},
  isbn      = {1-55860-101-5},
  pages     = {257-266},
  ee        = {db/conf/vldb/KitsuregawaNT89.html},
  crossref  = {DBLP:conf/vldb/89},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

In this paper, we show detailed analysis and performance evaluation of the Dynamic Hybrid GRACE Hash Join Method (DHGH Method) when the tuple distribution in buckets is unbalanced.

The conventional Hash Join Methods specify the tuple distribution in buckets statically. However it may differ from estimation since join operations are appliedwith selection operations. When the tuple distribution in buckets is unbalanced, the processing cost of join operation becomes more costly than the ideal case when you use Hybrid Hash Join Method (HH Method). On the other hand, when you use the DHGH Method, the destaging buckets are selected dynamically, gives the same performance as the ideal case even if the tuple distribution in buckets is unbalanced such as Zipf-like distributions.

We analyze the total I/O cost of a join operation at various number of buckets. The result shows that we have to determine the number of buckets based on the tuple distribution in buckets rather than the size of the sourcerelation. It is shown that we had better partition the source relation using a large number of small buckets instead of the smaller number of buckets almost filling the whole main memory adopted in the HH Method.

Copyright © 1989 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Peter M. G. Apers, Gio Wiederhold (Eds.): Proceedings of the Fifteenth International Conference on Very Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands. Morgan Kaufmann 1989, ISBN 1-55860-101-5
BibTeX

References

[Bra84]
Kjell Bratbergsengen: Hashing Methods and Relational Algebra Operations. VLDB 1984: 323-333 BibTeX
[DeW84]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 BibTeX
[DeW85]
David J. DeWitt, Robert H. Gerber: Multiprocessor Hash-Based Join Algorithms. VLDB 1985: 151-164 BibTeX
[Ger86]
...
[Kit83]
Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka: Application of Hash to Data Base Machine and Its Architecture. New Generation Comput. 1(1): 63-74(1983) BibTeX
[Knu73]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
BibTeX
[Nak88]
Masaya Nakayama, Masaru Kitsuregawa, Mikio Takagi: Hash-Partitioned Join Method Using Dynamic Destaging Strategy. VLDB 1988: 468-478 BibTeX
[Sha86]
Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. ACM Trans. Database Syst. 11(3): 239-264(1986) BibTeX
[Sto76]
Michael Stonebraker, Eugene Wong, Peter Kreps, Gerald Held: The Design and Implementation of INGRES. ACM Trans. Database Syst. 1(3): 189-222(1976) BibTeX
[Tur87]
...
[Tur88]
...
[Yam85]
Yasuo Yamane: A Hash Join Technique for Relational Database Systems. FODO 1985: 515-528 BibTeX

Referenced by

  1. Sven Helmer, Till Westmann, Guido Moerkotte: Diag-Join: An Opportunistic Join Algorithm for 1:N Relationships. VLDB 1998: 98-109
  2. Goetz Graefe, Ross Bunker, Shaun Cooper: Hash Joins and Hash Teams in Microsoft SQL Server. VLDB 1998: 86-97
  3. Sven Helmer, Guido Moerkotte: Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Join Predicates. VLDB 1997: 386-395
  4. Jignesh M. Patel, Jie-Bing Yu, Navin Kabra, Kristin Tufte, Biswadeep Nag, Josef Burger, Nancy E. Hall, Karthikeyan Ramasamy, Roger Lueder, Curt J. Ellmann, Jim Kupsch, Shelly Guo, David J. DeWitt, Jeffrey F. Naughton: Building a Scaleable Geo-Spatial DBMS: Technology, Implementation, and Evaluation. SIGMOD Conference 1997: 336-347
  5. Takayuki Tamura, Masaru Kitsuregawa: Implementation and Evaluation of the Bucket Flattening Omega Network of the Parallel Relational Database Server SDC-II. DASFAA 1997: 471-480
  6. Evan P. Harris, Kotagiri Ramamohanarao: Join Algorithm Costs Revisited. VLDB J. 5(1): 64-84(1996)
  7. Ming-Ling Lo, Chinya V. Ravishankar: Spatial Hash-Joins. SIGMOD Conference 1996: 247-258
  8. Ming-Ling Lo, Chinya V. Ravishankar: Towards Eliminating Random I/O in Hash Joins. ICDE 1996: 422-429
  9. Goetz Graefe, Richard L. Cole: Fast Algorithms for Universal Quantification in Large Databases. ACM Trans. Database Syst. 20(2): 187-236(1995)
  10. Goetz Graefe, Ann Linville, Leonard D. Shapiro: Sort versus Hash Revisited. IEEE Trans. Knowl. Data Eng. 6(6): 934-944(1994)
  11. Diane L. Davison, Goetz Graefe: Memory-Contention Responsive Hash Joins. VLDB 1994: 379-390
  12. Goetz Graefe: Sort-Merge-Join: An Idea Whose Time Has(h) Passed? ICDE 1994: 406-417
  13. Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993)
  14. HweeHwa Pang, Michael J. Carey, Miron Livny: Partially Preemptive Hash Joins. SIGMOD Conference 1993: 59-68
  15. Priti Mishra, Margaret H. Eich: Join Processing in Relational Databases. ACM Comput. Surv. 24(1): 63-113(1992)
  16. Masaru Kitsuregawa, Shin-ichiro Tsudaka, Miyuki Nakano: Parallel GRACE Hash Join on Shared-Everything Multiprocessor: Implementation and Performance Evaluation on Symmetry S81. ICDE 1992: 256-264
  17. Hongjun Lu, Kian-Lee Tan: Dynamic and Load-balanced Task-Oriented Datbase Query Processing in Parallel Systems. EDBT 1992: 357-372
  18. Masaru Kitsuregawa, Miyuki Nakano, Mikio Takagi: Performance Evaluation of Functional Disk System (FDS-R2). ICDE 1991: 416-425
  19. Masaru Kitsuregawa, Yasushi Ogawa: Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC). VLDB 1990: 210-221
  20. Lilian Harada, Miyuki Nakano, Masaru Kitsuregawa, Mikio Takagi: Query Processing for Multi-Attribute Clustered Records. VLDB 1990: 59-70
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:45:41 2009