Optimizing Equijoin Queries In Distributed Databases Where Relations Are Hash Partitioned.
Dennis Shasha, Jason Tsong-Li Wang:
Optimizing Equijoin Queries In Distributed Databases Where Relations Are Hash Partitioned.
ACM Trans. Database Syst. 16(2): 279-308(1991)@article{DBLP:journals/tods/ShashaW91,
author = {Dennis Shasha and
Jason Tsong-Li Wang},
title = {Optimizing Equijoin Queries In Distributed Databases Where Relations
Are Hash Partitioned},
journal = {ACM Trans. Database Syst.},
volume = {16},
number = {2},
year = {1991},
pages = {279-308},
ee = {http://doi.acm.org/10.1145/114325.103713, db/journals/tods/ShashaW91.html},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
Consider the class of distributed database systems consisting of a set of nodes connected by a high bandwidth network. Each node consists of a processor, a random access memory, and a slower but much larger memory such as a disk. There is no shared memory among the nodes. The data are horizontally partitioned often using a hash function. Such a description characterizes many parallel or distributed database systems that have recently been proposed, both commercial and academic. We study the optimization problem that arises when the query processor must repartition the relations and intermediate results participating in a multijoin query. Using estimates of the sizes of intermediate relations, we show (1) optimum solutions for closed chain queries; (2) the NP-completeness of the optimization problem for star, tree, and general graph queries; and (3) effective heuristics for these hard cases.
Our general approach and many of our results extend to other attribute partitioning schemes, for example, sort-partitioning on attributes, and to partitioned object databases.
Copyright © 1991 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
CDROM Version: Load the CDROM "Volume 3 Issue 2, TODS 1991-1995, TKDE 1989-1992" and ...
DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...
BibTeX
[Abstract, Index Terms and Review]
[Full Text in PDF Format, 1796 KB]
References
- [1]
- Peter M. G. Apers, Alan R. Hevner, S. Bing Yao:
Optimization Algorithms for Distributed Queries.
IEEE Trans. Software Eng. 9(1): 57-68(1983) BibTeX
- [2]
- Edward Babb:
Implementing a Relational Database by Means of Specialized Hardware.
ACM Trans. Database Syst. 4(1): 1-29(1979) BibTeX
- [3]
- Philip A. Bernstein, Nathan Goodman, Eugene Wong, Christopher L. Reeve, James B. Rothnie Jr.:
Query Processing in a System for Distributed Databases (SDD-1).
ACM Trans. Database Syst. 6(4): 602-625(1981) BibTeX
- [4]
- Dina Bitton, Haran Boral, David J. DeWitt, W. Kevin Wilkinson:
Parallel Algorithms for the Execution of Relational Database Operations.
ACM Trans. Database Syst. 8(3): 324-353(1983) BibTeX
- [5]
- Arbee L. P. Chen, Victor O. K. Li:
An Optimal Algorithm for Processing Distributed Star Queries.
IEEE Trans. Software Eng. 11(10): 1097-1107(1985) BibTeX
- [6]
- Dah-Ming W. Chiu, Philip A. Bernstein, Yu-Chi Ho:
Optimizing Chain Queries in a Distributed Database System.
SIAM J. Comput. 13(1): 116-134(1984) BibTeX
- [7]
- D. M. Chiu, Y. C. Ho:
A Methodology for Interpreting Tree Queries Into Optimal Semi-Join Expressions.
SIGMOD Conference 1980: 169-178 BibTeX
- [8]
- E. F. Codd:
Extending the Database Relational Model to Capture More Meaning.
ACM Trans. Database Syst. 4(4): 397-434(1979) BibTeX
- [9]
- George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller:
Data Placement In Bubba.
SIGMOD Conference 1988: 99-108 BibTeX
- [10]
- David J. DeWitt, Robert H. Gerber:
Multiprocessor Hash-Based Join Algorithms.
VLDB 1985: 151-164 BibTeX
- [11]
- David J. DeWitt, Robert H. Gerber, Goetz Graefe, Michael L. Heytens, Krishna B. Kumar, M. Muralikrishna:
GAMMA - A High Performance Dataflow Database Machine.
VLDB 1986: 228-237 BibTeX
- [12]
- David J. DeWitt, Marc G. Smith, Haran Boral:
A Single-User Performance Evaluation of the Teradata Database Machine.
HPTS 1987: 244-276 BibTeX
- [13]
- Danièle Gardy, Claude Puech:
On the Effects of Join Operations on Relation Sizes.
ACM Trans. Database Syst. 14(4): 574-603(1989) BibTeX
- [14]
- M. R. Garey, David S. Johnson:
Computers and Intractability: A Guide to the Theory of NP-Completeness.
W. H. Freeman 1979, ISBN 0-7167-1044-7
BibTeX
- [15]
- Bezalel Gavish, Arie Segev:
Set Query Optimization in Distributed Database Systems.
ACM Trans. Database Syst. 11(3): 265-293(1986) BibTeX
- [16]
- Nathan Goodman, Oded Shmueli:
Tree Queries: A Simple Class of Relational Queries.
ACM Trans. Database Syst. 7(4): 653-677(1982) BibTeX
- [17]
- Won Kim:
A New Way to Compute the Product and Join of Relations.
SIGMOD Conference 1980: 179-187 BibTeX
- [18]
- ...
- [19]
- Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka:
Application of Hash to Data Base Machine and Its Architecture.
New Generation Comput. 1(1): 63-74(1983) BibTeX
- [20]
- ...
- [21]
- Richard J. Lipton, Jeffrey F. Naughton:
Query Size Estimation by Adaptive Sampling.
PODS 1990: 40-46 BibTeX
- [22]
- Sakti Pramanik, David Vineyard:
Optimizing Join Queries in Distributed Databases.
IEEE Trans. Software Eng. 14(9): 1319-1326(1988) BibTeX
- [23]
- ...
- [24]
- Giovanni Maria Sacco:
Fragmentation: A Technique for Efficient Query Processing.
ACM Trans. Database Syst. 11(2): 113-133(1986) BibTeX
- [25]
- Arie Segev:
Optimization of Join Operations in Horizontally Partitioned Database Systems.
ACM Trans. Database Syst. 11(1): 48-80(1986) BibTeX
- [26]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34 BibTeX
- [27]
- Leonard D. Shapiro:
Join Processing in Database Systems with Large Main Memories.
ACM Trans. Database Syst. 11(3): 239-264(1986) BibTeX
- [28]
- ...
- [29]
- ...
- [30]
- ...
- [31]
- Michael Stonebraker, Erich J. Neuhold:
A Distributed Database Version of INGRES.
Berkeley Workshop 1977: 19-36 BibTeX
- [32]
- Wei Sun, Weiyi Meng, Clement T. Yu:
Query Optimization in Object-Oriented Database Systems.
DEXA 1990: 215-222 BibTeX
- [33]
- ...
- [34]
- Patrick Valduriez, Georges Gardarin:
Join and Semijoin Algorithms for a Multiprocessor Database Machine.
ACM Trans. Database Syst. 9(1): 133-161(1984) BibTeX
- [35]
- R. Williams, Dean Daniels, Laura M. Haas, George Lapis, Bruce G. Lindsay, Pui Ng, Ron Obermarck, Patricia G. Selinger, Adrian Walker, Paul F. Wilms, Robert A. Yost:
R*: An Overview of the Architecture.
JCDKB 1982: 1-27 BibTeX
- [36]
- Eugene Wong:
Dynamic Rematerialization: Processing Distributed Queries Using Redundant Data.
IEEE Trans. Software Eng. 9(3): 228-232(1983) BibTeX
- [37]
- ...
- [38]
- Clement T. Yu, Keh-Chang Guh, Weining Zhang, Marjorie Templeton, David Brill, Arbee L. P. Chen:
Algorithms to Process Distributed Queries in Fast Local Networks.
IEEE Trans. Computers 36(10): 1153-1164(1987) BibTeX
Referenced by
- John L. Pfaltz, Russell F. Haddleton, James C. French:
Scalable, Parallel, Scientific Databases.
SSDBM 1998: 4-11
- Sha Guo, Wei Sun, Mark Allen Weiss:
Solving Satisfiability and Implication Problems in Database Systems.
ACM Trans. Database Syst. 21(2): 270-293(1996)
- Chihping Wang, Ming-Syan Chen:
On the Complexity of Distributed Query Optimization.
IEEE Trans. Knowl. Data Eng. 8(4): 650-662(1996)
- Chengwen Liu, Hao Chen, Warren Krueger:
A Distributed Query Processing Strategy Using Placement Dependency.
ICDE 1996: 477-484
- Chengwen Liu, Hao Chen:
A Hash Partition Strategy for Distributed Query Processing.
EDBT 1996: 373-387
- Waqar Hasan, Rajeev Motwani:
Coloring Away Communication in Parallel Query Optimization.
VLDB 1995: 239-250
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
TODS, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Tue Jun 24 18:39:10 2008