ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Multiprocessor Hash-Based Join Algorithms.

David J. DeWitt, Robert H. Gerber: Multiprocessor Hash-Based Join Algorithms. VLDB 1985: 151-164
@inproceedings{DBLP:conf/vldb/DeWittG85,
  author    = {David J. DeWitt and
               Robert H. Gerber},
  editor    = {Alain Pirotte and
               Yannis Vassiliou},
  title     = {Multiprocessor Hash-Based Join Algorithms},
  booktitle = {VLDB'85, Proceedings of 11th International Conference on Very
               Large Data Bases, August 21-23, 1985, Stockholm, Sweden},
  publisher = {Morgan Kaufmann},
  year      = {1985},
  pages     = {151-164},
  ee        = {db/conf/vldb/DeWittG85.html},
  crossref  = {DBLP:conf/vldb/85},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

This paper extends earlier research on hash-join algorithms to a multiprocessor architecture. Implementations of a number of centralized join algorithms are described and measured. Evaluation of these algorithms served to verify earlier analylical results. In addition, they demonstrate that bit vector filtering provides dramatic improvement in the performance of all algorithms including the sort mergejoin algorithm. Multiprocessor configurations of the centralized Grace and Hybrid hash-join algorithms are also presented. Both algorithms arc shown to provide linear increases in throughput with corresponding increases in processor and disk resources.

Copyright © 1985 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 4, VLDB '75-'88" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Alain Pirotte, Yannis Vassiliou (Eds.): VLDB'85, Proceedings of 11th International Conference on Very Large Data Bases, August 21-23, 1985, Stockholm, Sweden. Morgan Kaufmann 1985
Contents BibTeX

References

[ASTR76]
Morton M. Astrahan, Mike W. Blasgen, Donald D. Chamberlin, Kapali P. Eswaran, Jim Gray, Patricia P. Griffiths, W. Frank King III, Raymond A. Lorie, Paul R. McJones, James W. Mehl, Gianfranco R. Putzolu, Irving L. Traiger, Bradford W. Wade, Vera Watson: System R: Relational Approach to Database Management. ACM Trans. Database Syst. 1(2): 97-137(1976) BibTeX
[BABB79]
Edward Babb: Implementing a Relational Database by Means of Specialized Hardware. ACM Trans. Database Syst. 4(1): 1-29(1979) BibTeX
[BITT83]
Dina Bitton, David J. DeWitt, Carolyn Turbyfill: Benchmarking Database Systems A Systematic Approach. VLDB 1983: 8-19 BibTeX
[BLAS77]
Mike W. Blasgen, Kapali P. Eswaran: Storage and Access in Relational Data Bases. IBM Systems Journal 16(4): 362-377(1977) BibTeX
[BRAT84]
Kjell Bratbergsengen: Hashing Methods and Relational Algebra Operations. VLDB 1984: 323-333 BibTeX
[BROW85]
...
[CHOU83]
Hong-Tai Chou, David J. DeWitt, Randy H. Katz, Anthony C. Klug: Design and Implementation of the Wisconsin Storage System. Softw., Pract. Exper. 15(10): 943-962(1985) BibTeX
[DEWI84a]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 BibTeX
[DEWI84b]
David J. DeWitt, Raphael A. Finkel, Marvin H. Solomon: The Crystal Multicomputer: Design and Implementation Experience. IEEE Trans. Software Eng. 13(8): 953-966(1987) BibTeX
[GARC84]
...
[GOOD81]
...
[GROS53]
...
[HE83]
...
[KIM85]
...
[KITS83a]
Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka: Application of Hash to Data Base Machine and Its Architecture. New Generation Comput. 1(1): 63-74(1983) BibTeX
[KITS83b]
...
[KITS83c]
...
[KNUT73]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
BibTeX
[PROT83]
...
[RIES78]
...
[SIEW82]
...
[STON76]
Michael Stonebraker, Eugene Wong, Peter Kreps, Gerald Held: The Design and Implementation of INGRES. ACM Trans. Database Syst. 1(3): 189-222(1976) BibTeX
[VALD84]
Patrick Valduriez, Georges Gardarin: Join and Semijoin Algorithms for a Multiprocessor Database Machine. ACM Trans. Database Syst. 9(1): 133-161(1984) BibTeX

Referenced by

  1. Erik Riedel, Garth A. Gibson, Christos Faloutsos: Active Storage for Large-Scale Data Mining and Multimedia. VLDB 1998: 62-73
  2. Sven Helmer, Till Westmann, Guido Moerkotte: Diag-Join: An Opportunistic Join Algorithm for 1:N Relationships. VLDB 1998: 98-109
  3. Goetz Graefe, Ross Bunker, Shaun Cooper: Hash Joins and Hash Teams in Microsoft SQL Server. VLDB 1998: 86-97
  4. Manish Mehta, David J. DeWitt: Data Placement in Shared-Nothing Parallel Database Systems. VLDB J. 6(1): 53-72(1997)
  5. Sakti Pramanik, Walid R. Tout: The NUMA with Clusters of Processors for Parallel Join. IEEE Trans. Knowl. Data Eng. 9(4): 653-660(1997)
  6. Sven Helmer, Guido Moerkotte: Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Join Predicates. VLDB 1997: 386-395
  7. Takayuki Tamura, Masaru Kitsuregawa: Implementation and Evaluation of the Bucket Flattening Omega Network of the Parallel Relational Database Server SDC-II. DASFAA 1997: 471-480
  8. Evan P. Harris, Kotagiri Ramamohanarao: Join Algorithm Costs Revisited. VLDB J. 5(1): 64-84(1996)
  9. Shahram Ghandeharizadeh, Richard Hull, Dean Jacobs: Heraclitus: Elevating Deltas to be First-Class Citizens in a Database Programming Language. ACM Trans. Database Syst. 21(3): 370-426(1996)
  10. Ming-Syan Chen, Philip S. Yu, Kun-Lung Wu: Optimization of Parallel Execution for Multi-Join Queries. IEEE Trans. Knowl. Data Eng. 8(3): 416-428(1996)
  11. Ming-Ling Lo, Chinya V. Ravishankar: Spatial Hash-Joins. SIGMOD Conference 1996: 247-258
  12. Ming-Ling Lo, Chinya V. Ravishankar: Towards Eliminating Random I/O in Hash Joins. ICDE 1996: 422-429
  13. Chengwen Liu, Hao Chen: A Hash Partition Strategy for Distributed Query Processing. EDBT 1996: 373-387
  14. Yigit Kulabas, Özgür Ulusoy: Heuristic Algorithms for Inter-Query Scheduling in Database Systems. ADBIS 1996: 141-145
  15. Nadejda Biscondi, André Flory, Lionel Brunie: Parallel Databases: Structured Query Optimization. ADBIS 1996: 146-152
  16. Goetz Graefe, Richard L. Cole: Fast Algorithms for Universal Quantification in Large Databases. ACM Trans. Database Syst. 20(2): 187-236(1995)
  17. Kien A. Hua, Chiang Lee, Chau M. Hua: Dynamic Load Balancing in Multicomputer Database Systems Using Partition Tuning. IEEE Trans. Knowl. Data Eng. 7(6): 968-983(1995)
  18. Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Applying Segmented Right-Deep Trees to Pipelining Multiple Hash Joins. IEEE Trans. Knowl. Data Eng. 7(4): 656-668(1995)
  19. Uwe Herzog, Ralf Schaarschmidt: Parallel Execution of Integrity Constraint Checks. CIKM 1995: 82-89
  20. Patrick Martin, Per-Åke Larson, Vinay Deshpande: Parallel Hash-Based Join Algorithms for a Shared-Everything. IEEE Trans. Knowl. Data Eng. 6(5): 750-763(1994)
  21. Goetz Graefe, Ann Linville, Leonard D. Shapiro: Sort versus Hash Revisited. IEEE Trans. Knowl. Data Eng. 6(6): 934-944(1994)
  22. Diane L. Davison, Goetz Graefe: Memory-Contention Responsive Hash Joins. VLDB 1994: 379-390
  23. Hui-I Hsiao, Ming-Syan Chen, Philip S. Yu: On Parallel Execution of Multiple Pipelined Hash Joins. SIGMOD Conference 1994: 185-196
  24. Goetz Graefe: Sort-Merge-Join: An Idea Whose Time Has(h) Passed? ICDE 1994: 406-417
  25. X. Zhao, Roger G. Johnson, Nigel J. Martin: DBJ - A Dynamic Balancing Hash Join Algorithm in Multiprocessor Database Systems (Extented Abstract). EDBT 1994: 301-308
  26. Mikal Ziane, Mohamed Zaït, Pascale Borla-Salamet: Parallel Query Processing with Zigzag Trees. VLDB J. 2(3): 277-301(1993)
  27. Doron Rotem, Gerhard A. Schloss, Arie Segev: Data Allocation for Multi-Disk Databases. IEEE Trans. Knowl. Data Eng. 5(5): 882-887(1993)
  28. Albert Chen, Yung-Feng Kao, Mike Pong, Diana Shak, Sunil Sharma, Jay Vaishnav, Hansjörg Zeller: Query Processing in NonStop SQL. IEEE Data Eng. Bull. 16(4): 29-41(1993)
  29. Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993)
  30. Eugene J. Shekita, Honesty C. Young, Kian-Lee Tan: Multi-Join Optimization for Symmetric Multiprocessors. VLDB 1993: 479-492
  31. Ambuj Shatdal, Jeffrey F. Naughton: Using Shared Virtual Memory for Parallel Join Processing. SIGMOD Conference 1993: 119-128
  32. Ming-Ling Lo, Ming-Syan Chen, Chinya V. Ravishankar, Philip S. Yu: On Optimal Processor Allocation to Support Pipelined Hash Joins. SIGMOD Conference 1993: 69-78
  33. Valery Soloviev: A Truncating Hash Algorithm for Processing Band-Join Queries. ICDE 1993: 419-427
  34. Manish Mehta, Valery Soloviev, David J. DeWitt: Batch Scheduling in Parallel Database Systems. ICDE 1993: 400-410
  35. Soon Myoung Chung, Jaerheen Yang: Distributive Join Algorithm for Cube-Connected Multiprocessors. DASFAA 1993: 253-260
  36. Priti Mishra, Margaret H. Eich: Join Processing in Relational Databases. ACM Comput. Surv. 24(1): 63-113(1992)
  37. David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40
  38. Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Using Segmented Right-Deep Trees for the Execution of Pipelined Hash Joins. VLDB 1992: 15-26
  39. Manuel A. Penaloza, Esen A. Ozkarahan: Parallel Algorithms for Executing Joins on Cube-Conneced Multicomputers. ICDE 1992: 20-27
  40. Masaru Kitsuregawa, Shin-ichiro Tsudaka, Miyuki Nakano: Parallel GRACE Hash Join on Shared-Everything Multiprocessor: Implementation and Performance Evaluation on Symmetry S81. ICDE 1992: 256-264
  41. Ming-Syan Chen, Philip S. Yu, Kun-Lung Wu: Scheduling and Processor Allocation for Parallel Execution of Multi-Join Queries. ICDE 1992: 58-67
  42. Hongjun Lu, Kian-Lee Tan: Dynamic and Load-balanced Task-Oriented Datbase Query Processing in Parallel Systems. EDBT 1992: 357-372
  43. Dennis Shasha, Jason Tsong-Li Wang: Optimizing Equijoin Queries In Distributed Databases Where Relations Are Hash Partitioned. ACM Trans. Database Syst. 16(2): 279-308(1991)
  44. Edward Omiecinski: Performance Analysis of a Load Balancing Hash-Join Algorithm for a Shared Memory Multiprocessor. VLDB 1991: 375-385
  45. Hongjun Lu, Ming-Chien Shan, Kian-Lee Tan: Optimization of Multi-Way Join Queries for Parallel Execution. VLDB 1991: 549-560
  46. David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider: An Evaluation of Non-Equijoin Algorithms. VLDB 1991: 443-452
  47. Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: An Effective Algorithm for Parallelizing Hash Joins in the Presence of Data Skew. ICDE 1991: 200-209
  48. Rakesh Agrawal, Shaul Dar, H. V. Jagadish: Direct Transitive Closure Algorithms: Design and Performance Evaluation. ACM Trans. Database Syst. 15(3): 427-458(1990)
  49. M. Seetha Lakshmi, Philip S. Yu: Effectiveness of Parallel Joins. IEEE Trans. Knowl. Data Eng. 2(4): 410-424(1990)
  50. David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, Rick Rasmussen: The Gamma Database Machine Project. IEEE Trans. Knowl. Data Eng. 2(1): 44-62(1990)
  51. Haran Boral, William Alexander, Larry Clay, George P. Copeland, Scott Danforth, Michael J. Franklin, Brian E. Hart, Marc G. Smith, Patrick Valduriez: Prototyping Bubba, A Highly Parallel Database System. IEEE Trans. Knowl. Data Eng. 2(1): 4-24(1990)
  52. Hansjörg Zeller, Jim Gray: An Adaptive Hash Join Algorithm for Multiuser Environments. VLDB 1990: 186-197
  53. Donovan A. Schneider, David J. DeWitt: Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469-480
  54. Hongjun Lu, Kian-Lee Tan, Ming-Chien Shan: Hash-Based Join Algorithms for Multiprocessor Computers. VLDB 1990: 198-209
  55. Masaru Kitsuregawa, Yasushi Ogawa: Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC). VLDB 1990: 210-221
  56. Eugene J. Shekita, Michael J. Carey: A Performance Evaluation of Pointer-Based Joins. SIGMOD Conference 1990: 300-311
  57. Balakrishna R. Iyer, Daniel M. Dias: System Issues in Parallel Sorting for Database Systems. ICDE 1990: 246-255
  58. Edward Omiecinski, Eileen Tien Lin: Hash-Based and Index-Based Join Algorithms for Cube and Ring Connected Multicomputers. IEEE Trans. Knowl. Data Eng. 1(3): 329-343(1989)
  59. Masaru Kitsuregawa, Masaya Nakayama, Mikio Takagi: The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method. VLDB 1989: 257-266
  60. Donovan A. Schneider, David J. DeWitt: A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment. SIGMOD Conference 1989: 110-121
  61. M. Seetha Lakshmi, Philip S. Yu: Limiting Factors of Join Performance on Parallel Processors. ICDE 1989: 488-496
  62. Goetz Graefe: Relational Division: Four Algorithms and Their Performance. ICDE 1989: 94-101
  63. Rakesh Agrawal, Shaul Dar, H. V. Jagadish: Composition of Database Relations. ICDE 1989: 102-108
  64. Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, Volume II. Computer Science Press 1989, ISBN 0-7167-8162-X
    Contents
  65. Michael Stonebraker, Randy H. Katz, David A. Patterson, John K. Ousterhout: The Design of XPRS. VLDB 1988: 318-330
  66. Masaya Nakayama, Masaru Kitsuregawa, Mikio Takagi: Hash-Partitioned Join Method Using Dynamic Destaging Strategy. VLDB 1988: 468-478
  67. Guy M. Lohman: Grammar-like Functional Rules for Representing Query Optimization Alternatives. SIGMOD Conference 1988: 18-27
  68. David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider: A Performance Analysis of the Gamma Database Machine. SIGMOD Conference 1988: 350-360
  69. Hidetoshi Monoi, Yukihiro Morita, Hidenori Itoh, Hiroshi Sakai, Shigeki Shibayama: Parallel Control Technique and Performance of an MPPM Knowledge-Base Machine. ICDE 1988: 210-217
  70. Ghassan Z. Qadah: Filter-Based Join Algorithms on Uniprocessor and Distributed-Memory Multiprocessor Database Machines. EDBT 1988: 388-413
  71. Rakesh Agrawal, H. V. Jagadish: Direct Algorithms for Computing the Transitive Closure of Database Relations. VLDB 1987: 255-266
  72. Tobin J. Lehman, Michael J. Carey: A Recovery Algorithm for A High-Performance Memory-Resident Database System. SIGMOD Conference 1987: 104-117
  73. Roger Shultz, Ila Miller: Tree Structured Multiple Processor Join Methods. ICDE 1987: 190-199
  74. Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. ACM Trans. Database Syst. 11(3): 239-264(1986)
  75. Patrick Valduriez, Setrag Khoshafian, George P. Copeland: Implementation Techniques of Complex Objects. VLDB 1986: 101-110
  76. Lothar F. Mackert, Guy M. Lohman: R* Optimizer Validation and Performance Evaluation for Distributed Queries. VLDB 1986: 149-159
  77. David J. DeWitt, Robert H. Gerber, Goetz Graefe, Michael L. Heytens, Krishna B. Kumar, M. Muralikrishna: GAMMA - A High Performance Dataflow Database Machine. VLDB 1986: 228-237
  78. Jean-Pierre Cheiney, Pascal Faudemay, Rodolphe Michel, Jean-Marc Thévenin: A Reliable Backend Using Multiattribute Clustering and Select-Join Operator. VLDB 1986: 220-227
  79. Lothar F. Mackert, Guy M. Lohman: R* Optimizer Validation and Performance Evaluation for Local Queries. SIGMOD Conference 1986: 84-95
  80. Tobin J. Lehman, Michael J. Carey: Query Processing in Main Memory Database Management Systems. SIGMOD Conference 1986: 239-250
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:45:24 2009