ACM SIGMOD Anthology TODS dblp.uni-trier.de

Join Processing in Database Systems with Large Main Memories.

Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. ACM Trans. Database Syst. 11(3): 239-264(1986)
@article{DBLP:journals/tods/Shapiro86,
  author    = {Leonard D. Shapiro},
  title     = {Join Processing in Database Systems with Large Main Memories},
  journal   = {ACM Trans. Database Syst.},
  volume    = {11},
  number    = {3},
  year      = {1986},
  pages     = {239-264},
  ee        = {http://doi.acm.org/10.1145/6314.6315, db/journals/tods/Shapiro86.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We study algorithms for computing the equijoin of two relations in a system with a standard architecture but with large amounts of main memory. Our algorithms are especially efficient when the main memory available is a significant fraction of the size of one of the relations to he joined; but they can be applied whenever there is memory equal to approximately the square root of the size of one relation. We present a new algorithm which is a hybrid of two hash-based algorithms and which dominates the other algorithma we present, including sort-merge. Even in a virtual memory environment, the hybrid algorithm dominates all the others we study.

Finally, we describe how three popular tools to increase the efficiency ofjoins, namely filters, Babb arrays, and semijoins, can he grafted onto any of our algorithms.

Copyright © 1986 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Joint ACM SIGMOD / IEEE Computer Society Anthology

CDROM Version: Load the CDROM "Volume 3 Issue 1, TODS 1976-1990" and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ... BibTeX

References

[1]
Edward Babb: Implementing a Relational Database by Means of Specialized Hardware. ACM Trans. Database Syst. 4(1): 1-29(1979) BibTeX
[2]
Philip A. Bernstein, Nathan Goodman, Eugene Wong, Christopher L. Reeve, James B. Rothnie Jr.: Query Processing in a System for Distributed Databases (SDD-1). ACM Trans. Database Syst. 6(4): 602-625(1981) BibTeX
[3]
Dina Bitton, Haran Boral, David J. DeWitt, W. Kevin Wilkinson: Parallel Algorithms for the Execution of Relational Database Operations. ACM Trans. Database Syst. 8(3): 324-353(1983) BibTeX
[4]
Mike W. Blasgen, Kapali P. Eswaran: Storage and Access in Relational Data Bases. IBM Systems Journal 16(4): 362-377(1977) BibTeX
[5]
Kjell Bratbergsengen: Hashing Methods and Relational Algebra Operations. VLDB 1984: 323-333 BibTeX
[6]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 BibTeX
[7]
David J. DeWitt, Robert H. Gerber: Multiprocessor Hash-Based Join Algorithms. VLDB 1985: 151-164 BibTeX
[8]
...
[9]
Wolfgang Effelsberg, Theo Härder: Principles of Database Buffer Management. ACM Trans. Database Syst. 9(4): 560-595(1984) BibTeX
[10]
Hector Garcia-Molina, Richard J. Lipton, Jacobo Valdes: A Massive Memory Machine. IEEE Trans. Computers 33(5): 391-399(1984) BibTeX
[11]
...
[12]
Larry Kerschberg, Peter D. Ting, S. Bing Yao: Query Optimization in Star Computer Networks. ACM Trans. Database Syst. 7(4): 678-711(1982) BibTeX
[13]
...
[14]
Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka: Application of Hash to Data Base Machine and Its Architecture. New Generation Comput. 1(1): 63-74(1983) BibTeX
[15]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
BibTeX
[16]
Gregory Piatetsky-Shapiro, Charles Connell: Accurate Estimation of the Number of Tuples Satisfying a Condition. SIGMOD Conference 1984: 256-276 BibTeX
[17]
Giovanni Maria Sacco, Mario Schkolnick: A Mechanism for Managing the Buffer Pool in a Relational Database System Using the Hot Set Model. VLDB 1982: 257-262 BibTeX
[18]
Dennis G. Severance, Richardo Duhne: A Practitioner's Guide To Addressing Algorithms. Commun. ACM 19(6): 314-326(1976) BibTeX
[19]
D. L. Slotnick: Logic per Track Devices. Advances in Computers 10: 291-296(1970) BibTeX
[20]
Michael Stonebraker: Operating System Support for Database Management. Commun. ACM 24(7): 412-418(1981) BibTeX
[21]
Patrick Valduriez, Georges Gardarin: Join and Semijoin Algorithms for a Multiprocessor Database Machine. ACM Trans. Database Syst. 9(1): 133-161(1984) BibTeX
[22]
Yasuo Yamane: A Hash Join Technique for Relational Database Systems. FODO 1985: 515-528 BibTeX

Referenced by

  1. Jayavel Shanmugasundaram, Eugene J. Shekita, Rimon Barr, Michael J. Carey, Bruce G. Lindsay, Hamid Pirahesh, Berthold Reinwald: Efficiently Publishing Relational Data as XML Documents. VLDB 2000: 65-76
  2. Jochen Van den Bercken, Martin Schneider, Bernhard Seeger: Plug&Join: An easy-to-use Generic Algorithm for Efficiently Processing Equi and Non-Equi Joins. EDBT 2000: 495-509
  3. Zhe Li, Kenneth A. Ross: Fast Joins Using Join Indices. VLDB J. 8(1): 1-24(1999)
  4. Surajit Chaudhuri, Kyuseok Shim: Optimization of Queries with User-Defined Predicates. ACM Trans. Database Syst. 24(2): 177-228(1999)
  5. Alfons Kemper, Donald Kossmann, Christian Wiesner: Generalised Hash Teams for Join and Group-by. VLDB 1999: 30-41
  6. Goetz Graefe: The Value of Merge-Join and Hash-Join in SQL Server. VLDB 1999: 250-253
  7. Donko Donjerkovic, Raghu Ramakrishnan: Probabilistic Optimization of Top N Queries. VLDB 1999: 411-422
  8. Ming-Chuan Wu: Query Optimization for Selections Using Bitmaps. SIGMOD Conference 1999: 227-238
  9. Francis C. Chu, Joseph Y. Halpern, Praveen Seshadri: Least Expected Cost Query Optimization: An Exercise in Utility. PODS 1999: 138-147
  10. Joseph M. Hellerstein: Optimization Techniques for Queries with Expensive Methods. ACM Trans. Database Syst. 23(2): 113-157(1998)
  11. Alex Delis, Nick Roussopoulos: Techniques for Update Handling in the Enhanced Client-Server DBMS. IEEE Trans. Knowl. Data Eng. 10(3): 458-476(1998)
  12. Sven Helmer, Till Westmann, Guido Moerkotte: Diag-Join: An Opportunistic Join Algorithm for 1:N Relationships. VLDB 1998: 98-109
  13. Tolga Urhan, Michael J. Franklin, Laurent Amsaleg: Cost Based Query Scrambling for Initial Delays. SIGMOD Conference 1998: 130-141
  14. Michael Steinbrunn, Guido Moerkotte, Alfons Kemper: Heuristic and Randomized Optimization for the Join Ordering Problem. VLDB J. 6(3): 191-208(1997)
  15. Yannis E. Ioannidis, Raymond T. Ng, Kyuseok Shim, Timos K. Sellis: Parametric Query Optimization. VLDB J. 6(2): 132-151(1997)
  16. Laura M. Haas, Michael J. Carey, Miron Livny, Amit Shukla: Seeking the Truth About ad hoc Join Costs. VLDB J. 6(3): 241-256(1997)
  17. Sven Helmer, Guido Moerkotte: Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Join Predicates. VLDB 1997: 386-395
  18. Minos N. Garofalakis, Yannis E. Ioannidis: Parallel Query Scheduling and Optimization with Time- and Space-Shared Resources. VLDB 1997: 296-305
  19. Jens Claußen, Alfons Kemper, Guido Moerkotte, Klaus Peithner: Optimizing Queries with Universal Quantification in Object-Oriented and Object-Relational Databases. VLDB 1997: 286-295
  20. Jussi Myllymaki, Miron Livny: Relational Joins for Data on Tertiary Storage. ICDE 1997: 159-168
  21. Evan P. Harris, Kotagiri Ramamohanarao: Join Algorithm Costs Revisited. VLDB J. 5(1): 64-84(1996)
  22. Shahram Ghandeharizadeh, Richard Hull, Dean Jacobs: Heraclitus: Elevating Deltas to be First-Class Citizens in a Database Programming Language. ACM Trans. Database Syst. 21(3): 370-426(1996)
  23. Ismail H. Toroslu, Ghassan Z. Qadah: The Strong Partial Transitive-Closure Problem: Algorithms and Performance Evaluation. IEEE Trans. Knowl. Data Eng. 8(4): 617-629(1996)
  24. Jukka Teuhola: Path Signatures: A Way to Speed Up Recursion in Relational Databases. IEEE Trans. Knowl. Data Eng. 8(3): 446-454(1996)
  25. Myra Spiliopoulou, Michael Hatzopoulos, Yannis Cotronis: Parallel Optimization of Large Join Queries with Set Operators and Aggregates in a Parallel Environment Supporting Pipeline. IEEE Trans. Knowl. Data Eng. 8(3): 429-445(1996)
  26. Viswanath Poosala, Yannis E. Ioannidis: Estimation of Query-Result Distribution and its Application in Parallel-Join Load Balancing. VLDB 1996: 448-459
  27. Wilburt Labio, Hector Garcia-Molina: Efficient Snapshot Differential Algorithms for Data Warehousing. VLDB 1996: 63-74
  28. Georges Gardarin, Fei Sha, Zhao-Hui Tang: Calibrating the Query Optimizer Cost Model of IRO-DB, an Object-Oriented Federated Database System. VLDB 1996: 378-389
  29. Georges Gardarin, Jean-Robert Gruser, Zhao-Hui Tang: Cost-based Selection of Path Expression Processing Algorithms in Object-Oriented Databases. VLDB 1996: 390-401
  30. Ming-Ling Lo, Chinya V. Ravishankar: Spatial Hash-Joins. SIGMOD Conference 1996: 247-258
  31. Michael J. Franklin, Björn Þór Jónsson, Donald Kossmann: Performance Tradeoffs for Client-Server Query Processing. SIGMOD Conference 1996: 149-160
  32. Ming-Ling Lo, Chinya V. Ravishankar: Towards Eliminating Random I/O in Hash Joins. ICDE 1996: 422-429
  33. Peter A. Buhr, Anil K. Goel, Naomi Nishimura, Prabhakar Ragde: Parallel Pointer-Based Join Algorithms in Memory-mapped Environments. ICDE 1996: 266-275
  34. Goetz Graefe, Richard L. Cole: Fast Algorithms for Universal Quantification in Large Databases. ACM Trans. Database Syst. 20(2): 187-236(1995)
  35. Fereidoon Sadri: Information Source Tracking Method: Efficiency Issues. IEEE Trans. Knowl. Data Eng. 7(6): 947-954(1995)
  36. HweeHwa Pang, Michael J. Carey, Miron Livny: Multiclass Query Scheduling in Real-Time Database Systems. IEEE Trans. Knowl. Data Eng. 7(4): 533-551(1995)
  37. Janet L. Wiener, Jeffrey F. Naughton: OODB Bulk Loading Revisited: The Partitioned-List Approach. VLDB 1995: 30-41
  38. Hongjun Lu, Kian-Lee Tan, Son Dao: The Fittest Survives: An Adaptive Approach to Query Optimization. VLDB 1995: 251-262
  39. Sang-Wook Kim, Wan-Sup Cho, Min-Jae Lee, Kyu-Young Whang: A New Algorithm for Processing Joins Using the Multilevel Grid File. DASFAA 1995: 115-123
  40. Byung Suk Lee, Gio Wiederhold: Efficiently Instantiating View-Objects From Remote Relational Databases. VLDB J. 3(3): 289-323(1994)
  41. Patrick Martin, Per-Åke Larson, Vinay Deshpande: Parallel Hash-Based Join Algorithms for a Shared-Everything. IEEE Trans. Knowl. Data Eng. 6(5): 750-763(1994)
  42. Eileen Tien Lin, Edward Omiecinski, Sudhakar Yalamanchili: Large Join Optimization on a Hypercube Multiprocessor. IEEE Trans. Knowl. Data Eng. 6(2): 304-315(1994)
  43. Goetz Graefe, Ann Linville, Leonard D. Shapiro: Sort versus Hash Revisited. IEEE Trans. Knowl. Data Eng. 6(6): 934-944(1994)
  44. Goetz Graefe: Volcano - An Extensible and Parallel Query Evaluation System. IEEE Trans. Knowl. Data Eng. 6(1): 120-135(1994)
  45. Hongjun Lu, Beng Chin Ooi, Kian-Lee Tan: On Spatially Partitioned Temporal Join. VLDB 1994: 546-557
  46. Diane L. Davison, Goetz Graefe: Memory-Contention Responsive Hash Joins. VLDB 1994: 379-390
  47. HweeHwa Pang, Michael J. Carey, Miron Livny: Managing Memory for Real-Time Queries. SIGMOD Conference 1994: 221-232
  48. Joseph M. Hellerstein: Practical Predicate Placement. SIGMOD Conference 1994: 325-335
  49. Jukka Teuhola: An Efficient Relational Implementation of Recursive Relationships using Path Signatures. ICDE 1994: 348-355
  50. Goetz Graefe: Sort-Merge-Join: An Idea Whose Time Has(h) Passed? ICDE 1994: 406-417
  51. Philip S. Yu, Douglas W. Cornell: Buffer Management Based on Return on Consumption in a Multi-Query Environment. VLDB J. 2(1): 1-37(1993)
  52. Christian S. Jensen, Leo Mark, Nick Roussopoulos, Timos K. Sellis: Using Differential Techniques to Efficiently Support Transaction Time. VLDB J. 2(1): 75-111(1993)
  53. Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993)
  54. Eugene J. Shekita, Honesty C. Young, Kian-Lee Tan: Multi-Join Optimization for Symmetric Multiprocessors. VLDB 1993: 479-492
  55. Manish Mehta, David J. DeWitt: Dynamic Memory Allocation for Multiple-Query Workloads. VLDB 1993: 354-367
  56. HweeHwa Pang, Michael J. Carey, Miron Livny: Partially Preemptive Hash Joins. SIGMOD Conference 1993: 59-68
  57. Ismail H. Toroslu, Ghassan Z. Qadah: The Efficient Computation of Strong Partial Transitive-Closures. ICDE 1993: 530-537
  58. Valery Soloviev: A Truncating Hash Algorithm for Processing Band-Join Queries. ICDE 1993: 419-427
  59. Ludger Becker, Klaus Hinrichs, Ulrich Finke: A New Algorithm for Computing Joins with Grid Files. ICDE 1993: 190-197
  60. Priti Mishra, Margaret H. Eich: Join Processing in Relational Databases. ACM Comput. Surv. 24(1): 63-113(1992)
  61. Yannis E. Ioannidis, Raymond T. Ng, Kyuseok Shim, Timos K. Sellis: Parametric Query Optimization. VLDB 1992: 103-114
  62. V. Srinivasan, Michael J. Carey: Compensation-Based On-Line Query Processing. SIGMOD Conference 1992: 331-340
  63. Anastasia Analyti, Sakti Pramanik: Fast Search in Main Memory Databases. SIGMOD Conference 1992: 215-224
  64. Dennis Shasha, Jason Tsong-Li Wang: Optimizing Equijoin Queries In Distributed Databases Where Relations Are Hash Partitioned. ACM Trans. Database Syst. 16(2): 279-308(1991)
  65. Mauro Negri, Giuseppe Pelagatti: Distributive Join: A New Algorithm for Joining Relations. ACM Trans. Database Syst. 16(4): 655-669(1991)
  66. Edward Omiecinski: Performance Analysis of a Load Balancing Hash-Join Algorithm for a Shared Memory Multiprocessor. VLDB 1991: 375-385
  67. Philip S. Yu, Douglas W. Cornell: Optimal Buffer Allocation in A Multi-Query Environment. ICDE 1991: 622-631
  68. William Perrizo, James Gustafson, Daniel Thureen, David Wenberg: Domain Vector Accelerator for Relational Operations. ICDE 1991: 491-498
  69. Masaru Kitsuregawa, Miyuki Nakano, Mikio Takagi: Performance Evaluation of Functional Disk System (FDS-R2). ICDE 1991: 416-425
  70. Daniel F. Lieuwen, David J. DeWitt: Optimizing Loops in Database Programming Languages. DBPL 1991: 287-305
  71. Kyu-Young Whang, Ravi Krishnamurthy: Query Optimization in a Memory-Resident Domain Relational Calculus Database System. ACM Trans. Database Syst. 15(1): 67-95(1990)
  72. M. Seetha Lakshmi, Philip S. Yu: Effectiveness of Parallel Joins. IEEE Trans. Knowl. Data Eng. 2(4): 410-424(1990)
  73. Hansjörg Zeller, Jim Gray: An Adaptive Hash Join Algorithm for Multiuser Environments. VLDB 1990: 186-197
  74. Donovan A. Schneider, David J. DeWitt: Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469-480
  75. Hongjun Lu, Kian-Lee Tan, Ming-Chien Shan: Hash-Based Join Algorithms for Multiprocessor Computers. VLDB 1990: 198-209
  76. Masaru Kitsuregawa, Yasushi Ogawa: Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC). VLDB 1990: 210-221
  77. Rajiv Jauhari, Michael J. Carey, Miron Livny: Priority-Hints: An Algorithm for Priority-Based Buffer Management. VLDB 1990: 708-721
  78. Michael J. Carey, Eugene J. Shekita, George Lapis, Bruce G. Lindsay, John McPherson: An Incremental Join Attachment for Starburst. VLDB 1990: 662-673
  79. Eugene J. Shekita, Michael J. Carey: A Performance Evaluation of Pointer-Based Joins. SIGMOD Conference 1990: 300-311
  80. Michael Stonebraker: Future Trends in Database Systems. IEEE Trans. Knowl. Data Eng. 1(1): 33-44(1989)
  81. Edward Omiecinski, Eileen Tien Lin: Hash-Based and Index-Based Join Algorithms for Cube and Ring Connected Multicomputers. IEEE Trans. Knowl. Data Eng. 1(3): 329-343(1989)
  82. Farshad Fotouhi, Sakti Pramanik: Optimal Secondary Storage Access Sequence for Performing Relational Join. IEEE Trans. Knowl. Data Eng. 1(3): 318-328(1989)
  83. Masaru Kitsuregawa, Masaya Nakayama, Mikio Takagi: The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method. VLDB 1989: 257-266
  84. Elena Barcucci, Alessandra Chiuderi, Renzo Pinzani, M. Cecilia Verri: Index Selection in Relational Databases. MFDBS 1989: 24-36
  85. M. Seetha Lakshmi, Philip S. Yu: Limiting Factors of Join Performance on Parallel Processors. ICDE 1989: 488-496
  86. Masaru Kitsuregawa, Miyuki Nakano, Mikio Takagi: Query Execution for Large Relations on Functional Disk Systems. ICDE 1989: 159-167
  87. Masaru Kitsuregawa, Miyuki Nakano, Mikio Takagi: Funtional Disk System as a High Performance Relational Storage. DASFAA 1989: 243-250
  88. Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, Volume II. Computer Science Press 1989, ISBN 0-7167-8162-X
    Contents
  89. Don S. Batory, T. Y. Leung, T. E. Wise: Implementation Concepts for an Extensible Data Model and Data Language. ACM Trans. Database Syst. 13(3): 231-262(1988)
  90. Masaya Nakayama, Masaru Kitsuregawa, Mikio Takagi: Hash-Partitioned Join Method Using Dynamic Destaging Strategy. VLDB 1988: 468-478
  91. Michael Stonebraker: Future Trends in Data Base Systems. ICDE 1988: 222-231
  92. Pankaj Goyal, H. F. Li, E. Regener, Fereidoon Sadri: Scheduling of Page Fetches in Join Operations Using Bc-Trees. ICDE 1988: 304-310
  93. Tobin J. Lehman, Michael J. Carey: A Recovery Algorithm for A High-Performance Memory-Resident Database System. SIGMOD Conference 1987: 104-117
  94. Dina Bitton, Maria Hanrahan, Carolyn Turbyfill: Performance of Complex Queries in Main Memory Database Systems. ICDE 1987: 72-81
  95. Tobin J. Lehman, Michael J. Carey: A Study of Index Structures for Main Memory Database Management Systems. VLDB 1986: 294-303
  96. Tobin J. Lehman, Michael J. Carey: Query Processing in Main Memory Database Management Systems. SIGMOD Conference 1986: 239-250
  97. Dina Bitton: The Effect of Large Main memory on Database Systems. SIGMOD Conference 1986: 337-339
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
TODS, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Tue Jun 24 18:38:59 2008