ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Join Algorithm Costs Revisited.

Evan P. Harris, Kotagiri Ramamohanarao: Join Algorithm Costs Revisited. VLDB J. 5(1): 64-84(1996)
@article{DBLP:journals/vldb/HarrisR96,
  author    = {Evan P. Harris and
               Kotagiri Ramamohanarao},
  title     = {Join Algorithm Costs Revisited},
  journal   = {VLDB J.},
  volume    = {5},
  number    = {1},
  year      = {1996},
  pages     = {64-84},
  ee        = {db/journals/vldb/HarrisR96.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

A method of analysing join algorithms based upon the time required to access, transfer and perform the relevant CPU-based operations on a disk page is proposed. The costs of variations of several of the standard join algorithms, including nested block, sort-merge, GRACE hash and hybrid hash, are presented. For a given total buffer size, the cost of these join algorithms depends on the parts of the buffer allocated for each purpose. For example, when joining two relations using the nested block join algorithm, the amount of buffer space allocated for the outer and inner relations can significantly affect the cost of the join. Analysis of expected and experimental results of various join algorithms show that a combination of the optimal nested block and optimal GRACE hash join algorithms usually provide the greatest cost benefit, unless the relation size is a small multiple of the memory size. Algorithms to quickly determine a buffer allocation producing the minimal cost for each of these algorithms are presented. When the relation size is a small multiple of the amount of main memory available (typically up to three to six times), the hybrid hash join algorithm is preferable.

Key Words

Join algorithms, minimisation, optimal buffer allocation.

Copyright © 1996 by Springer, Berlin, Heidelberg. Permission to make digital or hard copies of the abstract is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice along with the full citation.


Online Edition (Springer)

Citation Page

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ... BibTeX

References

[Aarts and Korst 1989]
...
[Bitton et al. 1983]
Dina Bitton, David J. DeWitt, Carolyn Turbyfill: Benchmarking Database Systems A Systematic Approach. VLDB 1983: 8-19 BibTeX
[Blasgen and Eswaran 1977]
Mike W. Blasgen, Kapali P. Eswaran: Storage and Access in Relational Data Bases. IBM Systems Journal 16(4): 362-377(1977) BibTeX
[Cheng et al. 1991]
Josephine M. Cheng, Donald J. Haderle, Richard Hedges, Balakrishna R. Iyer, Ted Messinger, C. Mohan, Yun Wang: An Efficient Hybrid Join Algorithm: A DB2 Prototype. ICDE 1991: 171-180 BibTeX
[DeWitt and Gerber 1985]
David J. DeWitt, Robert H. Gerber: Multiprocessor Hash-Based Join Algorithms. VLDB 1985: 151-164 BibTeX
[DeWitt et al. 1984]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 BibTeX
[Graefe 1993]
Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993) BibTeX
[Haas and Swami 1992]
Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350 BibTeX
[Hagmann 1986]
Robert B. Hagmann: An Observation on Database Buffering Performance Metrics. VLDB 1986: 289-293 BibTeX
[Harris and Ramamohanarao 1994]
Evan P. Harris, Kotagiri Ramamohanarao: Using Optimized Multi-Attribute Hash Indexes for Hash Joins. Australasian Database Conference 1994: 92-111 BibTeX
[Hua and Lee 1991]
Kien A. Hua, Chiang Lee: Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning. VLDB 1991: 525-535 BibTeX
[Kitsuregawa et al. 1983]
Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka: Application of Hash to Data Base Machine and Its Architecture. New Generation Comput. 1(1): 63-74(1983) BibTeX
[Kitsuregawa et al. 1989]
Masaru Kitsuregawa, Masaya Nakayama, Mikio Takagi: The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method. VLDB 1989: 257-266 BibTeX
[Knuth 1973]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
BibTeX
[Lipton et al. 1990]
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 BibTeX
[McVoy and Kleiman 1991]
Larry W. McVoy, Steve R. Kleiman: Extent-like Performance from a UNIX File System. USENIX Winter 1991: 33-44 BibTeX
[Merrett 1981]
T. H. Merrett: Why Sort-Merge Gives the Best Implementation of the Natural Join. SIGMOD Record 13(2): 39-51(1983) BibTeX
[Mishra and Eich 1992]
Priti Mishra, Margaret H. Eich: Join Processing in Relational Databases. ACM Comput. Surv. 24(1): 63-113(1992) BibTeX
[Nakayama et al. 1988]
Masaya Nakayama, Masaru Kitsuregawa, Mikio Takagi: Hash-Partitioned Join Method Using Dynamic Destaging Strategy. VLDB 1988: 468-478 BibTeX
[Omiecinski 1991]
Edward Omiecinski: Performance Analysis of a Load Balancing Hash-Join Algorithm for a Shared Memory Multiprocessor. VLDB 1991: 375-385 BibTeX
[Omiecinski and Lin 1992]
...
[Pang et al. 1993]
HweeHwa Pang, Michael J. Carey, Miron Livny: Partially Preemptive Hash Joins. SIGMOD Conference 1993: 59-68 BibTeX
[Richardson et al. 1987]
James P. Richardson, Hongjun Lu, Krishna P. Mikkilineni: Design and Evaluation of Parallel Pipelined Join Algorithms. SIGMOD Conference 1987: 399-409 BibTeX
[Schneider and DeWitt 1989]
Donovan A. Schneider, David J. DeWitt: A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment. SIGMOD Conference 1989: 110-121 BibTeX
[Shapiro 1986]
Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. ACM Trans. Database Syst. 11(3): 239-264(1986) BibTeX
[Shatdal and Naughton 1993]
Ambuj Shatdal, Jeffrey F. Naughton: Using Shared Virtual Memory for Parallel Join Processing. SIGMOD Conference 1993: 119-128 BibTeX
[Sun et al. 1993]
Wei Sun, Yibei Ling, Naphtali Rishe, Yi Deng: An Instant and Accurate Estimation Method for Joins and Selection in a Retrieval-Intensive Environment. SIGMOD Conference 1993: 79-88 BibTeX
[Vaghani et al. 1994]
Jayen Vaghani, Kotagiri Ramamohanarao, David B. Kemp, Zoltan Somogyi, Peter J. Stuckey, Tim S. Leask, James Harland: The Aditi Deductive Database System. VLDB J. 3(2): 245-288(1994) BibTeX
[Walton et al. 1991]
Christopher B. Walton, Alfred G. Dale, Roy M. Jenevein: A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. VLDB 1991: 537-548 BibTeX

Referenced by

  1. Reinhard Braumandl, Jens Claußen, Alfons Kemper, Donald Kossmann: Functional-Join Processing. VLDB J. 8(3-4): 156-177(2000)
  2. Till Westmann, Donald Kossmann, Sven Helmer, Guido Moerkotte: The Implementation and Performance of Compressed Databases. SIGMOD Record 29(3): 55-67(2000)
  3. Alfons Kemper, Donald Kossmann, Christian Wiesner: Generalised Hash Teams for Join and Group-by. VLDB 1999: 30-41
  4. Volker Markl, Martin Zirkel, Rudolf Bayer: Processing Operations with Restrictions in RDBMS without External Sorting: The Tetris Algorithm. ICDE 1999: 562-571
  5. Achim Kraiss, Peter Muth, Michael Gillmann: Tape-Disk Join Strategies under Disk Contention. ICDE 1999: 552-559
  6. Sven Helmer, Till Westmann, Guido Moerkotte: Diag-Join: An Opportunistic Join Algorithm for 1:N Relationships. VLDB 1998: 98-109
  7. Reinhard Braumandl, Jens Claußen, Alfons Kemper: Evaluating Functional Joins Along Nested Reference Sets in Object-Relational and Object-Oriented Databases. VLDB 1998: 110-122
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Journal: 1992-1995 Copyright © by VLDB Endowment / 1996-... Copyright © by Springer Verlag,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sun May 17 00:31:27 2009