ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Data Placement in Shared-Nothing Parallel Database Systems.

Manish Mehta, David J. DeWitt: Data Placement in Shared-Nothing Parallel Database Systems. VLDB J. 6(1): 53-72 (1997)
@article{DBLP:journals/vldb/MehtaD97,
  author    = {Manish Mehta and
               David J. DeWitt},
  title     = {Data Placement in Shared-Nothing Parallel Database Systems},
  journal   = {VLDB J.},
  volume    = {6},
  number    = {1},
  year      = {1997},
  pages     = {53-72},
  ee        = {db/journals/vldb/MehtaD97.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Data placement in shared-nothing database systems has been studied extensively in the past, and various placement algorithms have been proposed. However, there is no consensus on the most efficient data placement algorithm, and placement is still performed manually by a database administrator, with periodic reorganization to correct mistakes. This paper presents the first comprehensive simulation study of data placement issues in a shared-nothing system. The results show that current hardware technology trends have significantly changed the performance tradeoffs considered in past studies. A simplistic data placement strategy based on the new results is developed and shown to perform well for a variety of workloads.

Key Words

Declustering, Disk allocation, Resource allocation, Resource scheduling

Copyright © 1997 by Springer, Berlin, Heidelberg. Permission to make digital or hard copies of the abstract is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice along with the full citation.

References

[Bitt88]
Dina Bitton, Jim Gray: Disk Shadowing. VLDB 1988: 331-338
[Bora90]
Haran Boral, William Alexander, Larry Clay, George P. Copeland, Scott Danforth, Michael J. Franklin, Brian E. Hart, Marc G. Smith, Patrick Valduriez: Prototyping Bubba, A Highly Parallel Database System. IEEE Trans. Knowl. Data Eng. 2(1): 4-24 (1990)
[Brow92]
...
[Brow93]
Kurt P. Brown, Michael J. Carey, Miron Livny: Managing Memory to Meet Multiclass Workload Response Time Goals. VLDB 1993: 328-341
[Brow94]
Kurt P. Brown, Manish Mehta, Michael J. Carey, Miron Livny: Towards Automated Performance Tuning for Complex Workloads. VLDB 1994: 72-84
[Ceri84]
Stefano Ceri, Giuseppe Pelagatti: Distributed Databases: Principles and Systems. McGraw-Hill Book Company 1984, ISBN 0-07-010829-3
[Chen92a]
Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Using Segmented Right-Deep Trees for the Execution of Pipelined Hash Joins. VLDB 1992: 15-26
[Chen92b]
Ming-Syan Chen, Philip S. Yu, Kun-Lung Wu: Scheduling and Processor Allocation for Parallel Execution of Multi-Join Queries. ICDE 1992: 58-67
[Cope85]
George P. Copeland, Setrag Khoshafian: A Decomposition Storage Model. SIGMOD Conference 1985: 268-279
[Cope88]
George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller: Data Placement In Bubba. SIGMOD Conference 1988: 99-108
[DeWi84]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8
[DeWi90]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, Rick Rasmussen: The Gamma Database Machine Project. IEEE Trans. Knowl. Data Eng. 2(1): 44-62 (1990)
[DeWi92a]
David J. DeWitt, Jim Gray: Parallel Database Systems: The Future of High Performance Database Systems. Commun. ACM 35(6): 85-98 (1992)
[DeWi92b]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40
[Dowd82]
Lawrence W. Dowdy, Derrell V. Foster: Comparative Models of the File Assignment Problem. ACM Comput. Surv. 14(2): 287-313 (1982)
[Engl91]
...
[Falo93]
Christos Faloutsos, Pravin Bhagwat: Declustering Using Fractals. PDIS 1993: 18-25
[Gerb85]
David J. DeWitt, Robert H. Gerber: Multiprocessor Hash-Based Join Algorithms. VLDB 1985: 151-164
[Ghan90]
...
[Ghan92]
Shahram Ghandeharizadeh, David J. DeWitt, Waheed Qureshi: A Performance Analysis of Alternative Multi-Attribute Declustering Strategies. SIGMOD Conference 1992: 29-38
[Grae89]
...
[Gray87]
Jim Gray, Gianfranco R. Putzolu: The 5 Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time. SIGMOD Conference 1987: 395-398
[Haas90]
Laura M. Haas, Walter Chang, Guy M. Lohman, John McPherson, Paul F. Wilms, George Lapis, Bruce G. Lindsay, Hamid Pirahesh, Michael J. Carey, Eugene J. Shekita: Starburst Mid-Flight: As the Dust Clears. IEEE Trans. Knowl. Data Eng. 2(1): 143-160 (1990)
[Hua90]
Kien A. Hua, Chiang Lee: An Adaptive Data Placement Scheme for Parallel Database Computer Systems. VLDB 1990: 493-506
[Hua91]
Kien A. Hua, Chiang Lee: Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning. VLDB 1991: 525-535
[IBM93]
...
[Kits91]
Masaru Kitsuregawa, Yasushi Ogawa: Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC). VLDB 1990: 210-221
[Livn87]
Miron Livny, Setrag Khoshafian, Haran Boral: Multi-Disk Management Algorithms. SIGMETRICS 1987: 69-77
[Meht93]
Manish Mehta, David J. DeWitt: Dynamic Memory Allocation for Multiple-Query Workloads. VLDB 1993: 354-367
[Meht94]
...
[Nava89]
Shamkant B. Navathe, Minyoung Ra: Vertical Partitioning for Database Design: A Graphical Algorithm. SIGMOD Conference 1989: 440-450
[Ng91]
Raymond T. Ng, Christos Faloutsos, Timos K. Sellis: Flexible Buffer Allocation Based on Marginal Gains. SIGMOD Conference 1991: 387-396
[Omie91]
Edward Omiecinski: Performance Analysis of a Load Balancing Hash-Join Algorithm for a Shared Memory Multiprocessor. VLDB 1991: 375-385
[Oszu90]
M. Tamer Özsu, Patrick Valduriez: Principles of Distributed Database Systems. Prentice-Hall 1991, ISBN 0-13-715681-2
[Padm92]
...
[Para93]
...
[Rahm93a]
Erhard Rahm, Robert Marek: Analysis of Dynamic Load Balancing Strategies for Parallel Shared Nothing Database Systems. VLDB 1993: 182-193
[Rahm93b]
Erhard Rahm: Parallel Query Processing in Shared Disk Database Systems. HPTS 1993: 0-
[Ries78]
...
[Schn90]
Donovan A. Schneider, David J. DeWitt: Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469-480
[Schw90]
...
[Seli93]
Patricia G. Selinger: Predictions and Challenges for Database Systems in the Year 2000. VLDB 1993: 667-675
[Sell88]
Timos K. Sellis: Multiple-Query Optimization. ACM Trans. Database Syst. 13(1): 23-52 (1988)
[Shat93]
Ambuj Shatdal, Jeffrey F. Naughton: Using Shared Virtual Memory for Parallel Join Processing. SIGMOD Conference 1993: 119-128
[Stel93]
...
[Tand88]
The Tandem Performance Group: A Benchmark of NonStop SQL on the Debit Credit Transaction (Invited Paper). SIGMOD Conference 1988: 337-341
[Tera85]
...
[Walt91]
Christopher B. Walton, Alfred G. Dale, Roy M. Jenevein: A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. VLDB 1991: 537-548
[Weik91]
Gerhard Weikum, Peter Zabback, Peter Scheuermann: Dynamic File Allocation in Disk Arrays. SIGMOD Conference 1991: 406-415
[Weik92]
Gerhard Weikum, Peter Zabback: Tuning of Striping Units in Disk-Array-Based File Systems. RIDE-TQP 1992: 80-87
[Wils92]
Annita N. Wilschut, Jan Flokstra, Peter M. G. Apers: Parallelism in a Main-Memory DBMS: The Performance of PRISMA/DB. VLDB 1992: 521-532
[Wolf89]
Joel L. Wolf: The Placement Optimization Program: A Practical Solution to the Disk File Assignment Problem. SIGMETRICS 1989: 1-10
[Wolf90]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: An Effective Algorithm for Parallelizing Hash Joins in the Presence of Data Skew. ICDE 1991: 200-209
[Youn92]
...
[Yu93]
Philip S. Yu, Douglas W. Cornell: Buffer Management Based on Return on Consumption in a Multi-Query Environment. VLDB J. 2(1): 1-37 (1993)
[Zipf49]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949

Referenced by

  1. Uwe Röhm, Klemens Böhm, Hans-Jörg Schek: OLAP Query Routing and Physical Design in a Database Cluster. EDBT 2000: 254-268
  2. Lionel Brunie, Harald Kosch: ModParOpt: A Modular Query Optimizer for Multi-Query Parallel Databases. ADBIS 1997: 97-106