Digital Symposium Collection 2000  

 
 
 
 
 
 

 





















The Value of Merge-Join and Hash-Join in SQL Server

Goetz Graefe

  View Paper (PDF)  

Return to Industrial Session: Some Innovations in Microsoft SQL Server

Abstract
Microsoft SQL Server was successful for many years for transaction processing and decision support workloads with neither merge join nor hash join, relying entirely on nested loops and index nested loops join. How much difference do additional join algorithms really make, and how much system performance do they actually add? In a pure OLTP workload that requires only record-to-record navigation, intuition agrees that index nested loops join is sufficient. For a DSS workload, however, the question is much more complex. To answer this question, we have analyzed TPC-D query performance using an internal build of SQL Server with merge-join and hash-join enabled and disabled. It shows that merge join and hash join are both required to achieve the best performance for decision support workloads.


References

Note: References link to DBLP on the Web.

[Blasgen and Eswaran 1977]
Mike W. Blasgen , Kapali P. Eswaran : Storage and Access in Relational Data Bases. IBM Systems Journal 16(4) : 362-377(1977)
[Chaudhuri and Narasayya 1998]
Surajit Chaudhuri , Vivek R. Narasayya : Microsoft Index Tuning Wizard for SQL Server 7.0. SIGMOD Conference 1998 : 553-554
[DeWitt et al. 1984]
David J. DeWitt , Randy H. Katz , Frank Olken , Leonard D. Shapiro , Michael Stonebraker , David A. Wood : Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984 : 1-8
[DeWitt et al. 1993]
David J. DeWitt , Jeffrey F. Naughton , J. Burger : Nested Loops Revisited. PDIS 1993 : 230-242
[Graefe et al. 1998]
Goetz Graefe , Ross Bunker , Shaun Cooper : Hash Joins and Hash Teams in Microsoft SQL Server. VLDB 1998 : 86-97
[Kitsuregawa et al. 1983]
Masaru Kitsuregawa , Hidehiko Tanaka , Tohru Moto-Oka : Application of Hash to Data Base Machine and Its Architecture. New Generation Computing 1(1) : 63-74(1983)
[Sacco 1986]
Giovanni Maria Sacco : Fragmentation: A Technique for Efficient Query Processing. TODS 11(2) : 113-133(1986)
[Selinger et al. 1979]
Patricia G. Selinger , Morton M. Astrahan , Donald D. Chamberlin , Raymond A. Lorie , Thomas G. Price : Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979 : 23-34
[Shapiro 1986]
Leonard D. Shapiro : Join Processing in Database Systems with Large Main Memories. TODS 11(3) : 239-264(1986)
[TPC-D]
Transaction Processing Performance Council. http://www.tpc.org

BIBTEX

@inproceedings{DBLP:conf/vldb/Graefe99,
  author    = {Goetz Graefe},
   editor    = {Malcolm P. Atkinson and
                Maria E. Orlowska and
                Patrick Valduriez and
                Stanley B. Zdonik and
                Michael L. Brodie},
   title     = {The Value of Merge-Join and Hash-Join in SQL Server},
   booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
                Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
                UK},
   publisher = {Morgan Kaufmann},
   year      = {1999},
   isbn      = {1-55860-615-5},
   pages     = {250-253},
   crossref  = {DBLP:conf/vldb/99},
   bibsource = {DBLP, http://dblp.uni-trier.de} } },


























Copyright(C) 2000 ACM