ACM SIGMOD Anthology TODS dblp.uni-trier.de

Disk Allocation for Cartesian Product Files on Multiple-Disk Systems.

David Hung-Chang Du, J. S. Sobolewski: Disk Allocation for Cartesian Product Files on Multiple-Disk Systems. ACM Trans. Database Syst. 7(1): 82-101(1982)
@article{DBLP:journals/tods/DuS82,
  author    = {David Hung-Chang Du and
               J. S. Sobolewski},
  title     = {Disk Allocation for Cartesian Product Files on Multiple-Disk
               Systems},
  journal   = {ACM Trans. Database Syst.},
  volume    = {7},
  number    = {1},
  year      = {1982},
  pages     = {82-101},
  ee        = {http://doi.acm.org/10.1145/319682.319698, db/journals/tods/DuS82.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

Cartesian product files have recently been shown to exhibit attractive properties for partial match queries. This paper considers the file allocation problem for Cartesian product files, which can be stated as follows: Given a k-attribute Cartesian product file and an m-disk system, allocate buckets among the m disks in such a way that, for all possible partial match queries, the concurrency of disk accesses is maximized. The Risk Modulo (DM) allocation method is described first, and it is shown to be strict optimal under many conditions commonly occurring in practice, including all possible partial match queries when the number of disks is 2 or 3. It is also shown that although it has good performance, the DM allocation method is not strict optimal for all possible partial match queries when the number of disks is greater than 3. The General Disk Modulo (GDM) allocation method is then described, and a sufficient but not necessary condition for strict optimality of the GDM method for all partial match queries and any number of disks is then derived. Simulation studies comparing the DM and random allocation methods in terms of the average number of disk accesses, in response to various classes of partial match queries, show the former to be significantly more effective even when the number of disks is greater than 3, that is, even in cases where the DM method is not strict optimal. The results that have been derived formally and shown by simulation can be used for more effective design of optimal file systems for partial match queries. When considering multiple-disk systems with independent access paths, it is important to ensure that similar records are clustered into the same or similar buckets, while similar buckets should be dispersed uniformly among the disks.

Copyright © 1982 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Joint ACM SIGMOD / IEEE Computer Society Anthology

CDROM Version: Load the CDROM "Volume 3 Issue 1, TODS 1976-1990" and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ... BibTeX

References

[1]
Alfred V. Aho, Jeffrey D. Ullman: Optimal Partial-Match Retrieval When Fields Are Independently Specified. ACM Trans. Database Syst. 4(2): 168-179(1979) BibTeX
[2]
Walter A. Burkhard: Partial-Match Hash Coding: Benefits of Redundancy. ACM Trans. Database Syst. 4(2): 228-239(1979) BibTeX
[3]
Chin-Chen Chang, Richard C. T. Lee, David Hung-Chang Du: Some Properties of Cartesian Product Files. SIGMOD Conference 1980: 157-168 BibTeX
[4]
W. C. Lin, Richard C. T. Lee, David Hung-Chang Du: Common Properties of Some Multiattribute File Systems. IEEE Trans. Software Eng. 5(2): 160-174(1979) BibTeX
[5]
J. H. Liou, S. Bing Yao: Multi-dimensional clustering for data base organizations. Inf. Syst. 2(4): 187-198(1977) BibTeX
[6]
Ronald L. Rivest: Partial-Match Retrieval Algorithms. SIAM J. Comput. 5(1): 19-50(1976) BibTeX
[7]
James B. Rothnie Jr., Tomas Lozano: Attribute Based File Organization in a Paged Memory Environment. Commun. ACM 17(2): 63-69(1974) BibTeX

Referenced by

  1. Rakesh K. Sinha, Randeep Bhatia, Chung-Min Chen: Asymptotically Optimal Declustering Schemes for Range Queries. ICDT 2001: 144-158
  2. Mikhail J. Atallah, Sunil Prabhakar: (Almost) Optimal Parallel Block Access for Range Queries. PODS 2000: 205-215
  3. Randeep Bhatia, Rakesh K. Sinha, Chung-Min Chen: Hierarchical Declustering Schemes for Range Queries. EDBT 2000: 525-537
  4. Hakan Ferhatosmanoglu, Divyakant Agrawal, Amr El Abbadi: Concentric Hyperspaces and Disk Allocation for Fast Parallel Range Searching. ICDE 1999: 608-615
  5. Chung-Min Chen, Rakesh K. Sinha: Raster-Spatial Data Declustering Revisited: An Interactive Navigation Perspective. ICDE 1999: 600-607
  6. Peter Scheuermann, Gerhard Weikum, Peter Zabback: Data Partitioning and Load Balancing in Parallel Disk Systems. VLDB J. 7(1): 48-66(1998)
  7. Shashi Shekhar, Sivakumar Ravada, Vipin Kumar, Douglas Chubb, Greg Turner: Declustering and Load-Balancing Methods for Parallelizing Geographic Information Systems. IEEE Trans. Knowl. Data Eng. 10(4): 632-655(1998)
  8. Bongki Moon, Joel H. Saltz: Scalability Analysis of Declustering Methods for Multidimensional Range Queries. IEEE Trans. Knowl. Data Eng. 10(2): 310-327(1998)
  9. Szu-Wen Kuo, Marianne Winslett, Ying Chen, Yong Cho: Efficient I/O of Grid Hierarchies for AMR Computations on Parallel Disks. SSDBM 1998: 12-21
  10. Sunil Prabhakar, Khaled A. S. Abdel-Ghaffar, Divyakant Agrawal, Amr El Abbadi: Cyclic Allocation of Two-Dimensional Data. ICDE 1998: 94-101
  11. C. Y. Chen, H. F. Lin, Chin-Chen Chang, Richard C. T. Lee: Optimal Bucket Allocation Design of k-ary MKH Files for Partial Match Retrieval. IEEE Trans. Knowl. Data Eng. 9(1): 148-160(1997)
  12. Stefan Berchtold, Christian Böhm, Bernhard Braunmüller, Daniel A. Keim, Hans-Peter Kriegel: Fast Parallel Similarity Search in Multimedia Databases. SIGMOD Conference 1997: 1-12
  13. Khaled A. S. Abdel-Ghaffar, Amr El Abbadi: Optimal Allocation of Two-Dimensional Data. ICDT 1997: 409-418
  14. Chialin Chang, Bongki Moon, Anurag Acharya, Carter Shock, Alan Sussman, Joel H. Saltz: Titan: A High-Performance Remote Sensing Database. ICDE 1997: 375-384
  15. Paolo Ciaccia, Paolo Tiberio, Pavel Zezula: Declustering of Key-Based Partitioned Signature Files. ACM Trans. Database Syst. 21(3): 295-338(1996)
  16. Peter Muth, Achim Kraiss, Gerhard Weikum: LoT: Dynamic Declustering of TSB-Tree Nodes for Parallel Access to Temporal Data. EDBT 1996: 553-572
  17. Nick Koudas, Christos Faloutsos, Ibrahim Kamel: Declustering Spatial Databases on a Multi-Computer Architecture. EDBT 1996: 592-614
  18. Nick Bassiliades, Ioannis P. Vlahavas: A Non-Uniform Data Fragmentation Strategy for Parallel Main-Menory Database Systems. VLDB 1995: 370-381
  19. Gerhard Weikum: Tutorial on Parallel Database Systems. ICDT 1995: 33-37
  20. Duen-Ren Liu, Shashi Shekhar: A Similarity Graph-Based Approach to Declustering Problems and Its Application towards Paralleling Grid Files. ICDE 1995: 373-381
  21. Jaideep Srivastava, Thomas M. Niccum, Bhaskar Himatsingka: Data Declustering in PADMA: A PArallel Database MAnager. IEEE Data Eng. Bull. 17(3): 3-13(1994)
  22. Yvonne Zhou, Shashi Shekhar, Mark Coyle: Disk Allocation Methods for Parallelizing Grid Files. ICDE 1994: 243-252
  23. Bhaskar Himatsingka, Jaideep Srivastava: Performance Evaluation of Grid Based Multi-Attibute Record Declustering Methods. ICDE 1994: 356-365
  24. Khaled A. S. Abdel-Ghaffar, Amr El Abbadi: Optimal Disk Allocation for Partial Match Queries. ACM Trans. Database Syst. 18(1): 132-156(1993)
  25. Ling Tony Chen, Doron Rotem: Declustering Objects for Visualization. VLDB 1993: 85-96
  26. Jianzhong Li, Jaideep Srivastava, Doron Rotem: CMD: A Multidimensional Declustering Method for Parallel Data Systems. VLDB 1992: 3-14
  27. Edward Omiecinski, Peter Scheuermann: A Parallel Algorithm for Record Clustering. ACM Trans. Database Syst. 15(4): 599-624(1990)
  28. Shahram Ghandeharizadeh, David J. DeWitt: Hybrid-Range Partitioning Strategy: A New Declustering Strategy for Multiprocessor Database Machines. VLDB 1990: 481-492
  29. Khaled A. S. Abdel-Ghaffar, Amr El Abbadi: On the Optimality of Disk Allocation for Cartesian Product Files. PODS 1990: 258-264
  30. Christos Faloutsos, Dimitris N. Metaxas: Declustering Using Error Correcting Codes. PODS 1989: 253-258
  31. Myoung-Ho Kim, Sakti Pramanik: Optimal File Distribution For Partial Match Retrieval. SIGMOD Conference 1988: 173-182
  32. C. Thomas Wu, Walter A. Burkhard: Associative Searching in Multiple Storage Units. ACM Trans. Database Syst. 12(1): 38-64(1987)
  33. M. T. Fang, Richard C. T. Lee, Chin-Chen Chang: The Idea of De-Clustering and its Applications. VLDB 1986: 181-188
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
TODS, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Tue Jun 24 18:38:48 2008