ACM SIGMOD Anthology VLDB dblp.uni-trier.de

An Adaptive Data Placement Scheme for Parallel Database Computer Systems.

Kien A. Hua, Chiang Lee: An Adaptive Data Placement Scheme for Parallel Database Computer Systems. VLDB 1990: 493-506
@inproceedings{DBLP:conf/vldb/HuaL90,
  author    = {Kien A. Hua and
               Chiang Lee},
  editor    = {Dennis McLeod and
               Ron Sacks-Davis and
               Hans-J{\"o}rg Schek},
  title     = {An Adaptive Data Placement Scheme for Parallel Database Computer
               Systems},
  booktitle = {16th International Conference on Very Large Data Bases, August
               13-16, 1990, Brisbane, Queensland, Australia, Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1990},
  isbn      = {1-55860-149-X},
  pages     = {493-506},
  ee        = {db/conf/vldb/HuaL90.html},
  crossref  = {DBLP:conf/vldb/90},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

The capacity and performance of database management system (DBMS) using a conventional (von Newmann-type) computer are limited by the total I/O channel bandwidth, the aggregate processing power, and the amount of main memory. With the advent of micro-processor, memory, and communication technology, it is econominally feasible to develop a parallel database computer system. The parallel processing techniques are employed to utilize the available resources in a coordinated fashion to solve the DBMS capacity and performance problems. Relations in such an environment are declustered into fragments and spreaded across computers. To achieve the optimal performance in data processing, it is essential for each computer to have a perfectly balanced load (i.e., identical amount of data). However, fragment sizes may vary due to insertions to and deletions from a relation. To retain good performance, the system needs to periodically rebalance data loads among the computers.

In this paper, we present an adaptive data placement scheme which balances computer work loads during query processing. The entire scheme is built on top of the popular grid file structure (but not limited to grid file). The adaptivity of the scheme and its relevant features are discussed. The cost of load rebalancing is estimated. The result shows that under our assumptions, it is always beneficial to perform load rebalancing before performing a join on skewed data.

Copyright © 1990 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Dennis McLeod, Ron Sacks-Davis, Hans-Jörg Schek (Eds.): 16th International Conference on Very Large Data Bases, August 13-16, 1990, Brisbane, Queensland, Australia, Proceedings. Morgan Kaufmann 1990, ISBN 1-55860-149-X
BibTeX

References

[Ben79]
Jon Louis Bentley: Multidimensional Binary Search Trees in Database Applications. IEEE Trans. Software Eng. 5(4): 333-340(1979) BibTeX
[Bhi88]
Anupam Bhide: An Analysis of Three Transaction Processing Architectures. VLDB 1988: 339-350 BibTeX
[Cop88]
George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller: Data Placement In Bubba. SIGMOD Conference 1988: 99-108 BibTeX
[Dew84]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 BibTeX
[Dew86]
David J. DeWitt, Robert H. Gerber, Goetz Graefe, Michael L. Heytens, Krishna B. Kumar, M. Muralikrishna: GAMMA - A High Performance Dataflow Database Machine. VLDB 1986: 228-237 BibTeX
[Kit84]
...
[Kit89]
Masaru Kitsuregawa, Lilian Harada, Mikio Takagi: Join Strategies on KB-Tree Indexed Relations. ICDE 1989: 85-93 BibTeX
[Kri88]
Hans-Peter Kriegel, Bernhard Seeger: PLOP-Hashing: A Grid File without Directory. ICDE 1988: 369-376 BibTeX
[Lak88]
M. Seetha Lakshmi, Philip S. Yu: Effect of Skew on Join Performance in Parallel Architectures. DPDS 1988: 107-120 BibTeX
[Nie84]
Jürg Nievergelt, Hans Hinterberger, Kenneth C. Sevcik: The Grid File: An Adaptable, Symmetric Multikey File Structure. ACM Trans. Database Syst. 9(1): 38-71(1984) BibTeX
[Ozk88]
Esen A. Ozkarahan, Cem H. Bozsahin: Join Strategies Using Data Space Partitioning. New Generation Comput. 6(1): 19-39(1988) BibTeX
[Soc79]
Gary H. Sockut, Robert P. Goldberg: Database Reorganization - Principles and Practice. ACM Comput. Surv. 11(4): 371-395(1979) BibTeX
[Su83]
...
[Ter88]
...

Referenced by

  1. Manish Mehta, David J. DeWitt: Data Placement in Shared-Nothing Parallel Database Systems. VLDB J. 6(1): 53-72(1997)
  2. Anant Jhingran, Timothy Malkemus, Sriram Padmanabhan: Query Optimization in DB2 Parallel Edition. IEEE Data Eng. Bull. 20(2): 27-34(1997)
  3. Chiang Lee, Chi-Sheng Shih, Yaw-Huei Chen: A Graph-Theoretic Model for Optimizing Large Join Queries. DASFAA 1997: 87-96
  4. Chiang Lee, Zue-An Chang: Utilizing Page-Level Join Index for Optimization in Parallel Join Execution. IEEE Trans. Knowl. Data Eng. 7(6): 900-914(1995)
  5. Kien A. Hua, Chiang Lee, Chau M. Hua: Dynamic Load Balancing in Multicomputer Database Systems Using Partition Tuning. IEEE Trans. Knowl. Data Eng. 7(6): 968-983(1995)
  6. Manish Mehta, David J. DeWitt: Managing Intra-operator Parallelism in Parallel Database Systems. VLDB 1995: 382-394
  7. Gerhard Weikum: Tutorial on Parallel Database Systems. ICDT 1995: 33-37
  8. Kien A. Hua, Wallapak Tavanapong, Honesty C. Young: A Performance Evaluation of Load Balancing Techniques for Join Operations on Multicomputer Database Systems. ICDE 1995: 44-51
  9. Won S. Lee, Phillip C.-Y. Sheu: An Object-Oriented Query Evaluation Scheme for Logical Databases in Massively Parallel Environment. IEEE Trans. Knowl. Data Eng. 6(1): 181-187(1994)
  10. Kien A. Hua, Yu-lung Lo, Honesty C. Young: Considering Data Skew Factor in Multi-Way Join Query Optimization for Parallel Execution. VLDB J. 2(3): 303-330(1993)
  11. Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993)
  12. Jianzhong Li, Doron Rotem, Jaideep Srivastava: Algorithms for Loading Parallel Grid Files. SIGMOD Conference 1993: 347-356
  13. Chiang Lee, Zue-An Chang: Workload Balance and Page Access Scheduling For Parallel Joins In Shared-Nothing Systems. ICDE 1993: 411-418
  14. Priti Mishra, Margaret H. Eich: Join Processing in Relational Databases. ACM Comput. Surv. 24(1): 63-113(1992)
  15. Jianzhong Li, Jaideep Srivastava, Doron Rotem: CMD: A Multidimensional Declustering Method for Parallel Data Systems. VLDB 1992: 3-14
  16. Kien A. Hua, Chiang Lee: Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning. VLDB 1991: 525-535
  17. Bernhard Seeger, Per-Åke Larson: Multi-Disk B-trees. SIGMOD Conference 1991: 436-445
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:45:45 2009