ACM SIGMOD Anthology TODS dblp.uni-trier.de

A Parallel Algorithm for Record Clustering.

Edward Omiecinski, Peter Scheuermann: A Parallel Algorithm for Record Clustering. ACM Trans. Database Syst. 15(4): 599-624(1990)
@article{DBLP:journals/tods/OmiecinskiS90,
  author    = {Edward Omiecinski and
               Peter Scheuermann},
  title     = {A Parallel Algorithm for Record Clustering},
  journal   = {ACM Trans. Database Syst.},
  volume    = {15},
  number    = {4},
  year      = {1990},
  pages     = {599-624},
  ee        = {http://doi.acm.org/10.1145/99935.99947, db/journals/tods/OmiecinskiS90.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We present an efficient heuristic algorithm for record clustering that can run on a SIMD machine. We introduce the P-tree, and its associated numbering scheme, which in the split phase allows each processor independently to compute the unique cluster number of a record satisfying an arbitrary query. We show that by restricting ourselves in the merge phase to combining only sibling clusters, we obtain a parallel algorithm whose speedup ratio is optimal in the number of processors used. Finally, we report on experiments showing that our method produces substantial savings in an environment with relatively little overlap among the queries.

Copyright © 1990 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Joint ACM SIGMOD / IEEE Computer Society Anthology

CDROM Version: Load the CDROM "Volume 3 Issue 1, TODS 1976-1990" and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ... BibTeX

References

[1]
Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman: Data Structures and Algorithms. Addison-Wesley 1983, ISBN 0-201-00023-7
BibTeX
[2]
Jayanta Banerjee, David K. Hsiao, Richard I. Baum: Concepts and Capabilities of a Database Computer. ACM Trans. Database Syst. 3(4): 347-384(1978) BibTeX
[3]
Gérard M. Baudet, David Stevenson: Optimal Sorting Algorithms for Parallel Computers. IEEE Trans. Computers 27(1): 84-87(1978) BibTeX
[4]
Dina Bitton, David J. DeWitt, David K. Hsiao, Jai Menon: A Taxonomy of Parallel Sorting. ACM Comput. Surv. 16(3): 287-318(1984) BibTeX
[5]
Dina Bitton, Haran Boral, David J. DeWitt, W. Kevin Wilkinson: Parallel Algorithms for the Execution of Relational Database Operations. ACM Trans. Database Syst. 8(3): 324-353(1983) BibTeX
[6]
David Hung-Chang Du, J. S. Sobolewski: Disk Allocation for Cartesian Product Files on Multiple-Disk Systems. ACM Trans. Database Syst. 7(1): 82-101(1982) BibTeX
[7]
...
[8]
Daniel S. Hirschberg: Fast Parallel Sorting Algorithms. Commun. ACM 21(8): 657-661(1978) BibTeX
[9]
Matti Jakobsson: Reducing block accesses in inverted files by partial clustering. Inf. Syst. 5(1): 1-5(1980) BibTeX
[10]
J. H. Liou, S. Bing Yao: Multi-dimensional clustering for data base organizations. Inf. Syst. 2(4): 187-198(1977) BibTeX
[11]
Jürg Nievergelt, Hans Hinterberger, Kenneth C. Sevcik: The Grid File: An Adaptable, Symmetric Multikey File Structure. ACM Trans. Database Syst. 9(1): 38-71(1984) BibTeX
[12]
...
[13]
Edward Omiecinski: Incremental File Reorganization Schemes. VLDB 1985: 346-357 BibTeX
[14]
...
[15]
Jack A. Orenstein, T. H. Merrett, Luc Devroye: Linear Sorting with O(log n) Processors. BIT 23(2): 170-180(1983) BibTeX
[16]
Aris M. Ouksel, Peter Scheuermann: Storage Mappings for Multidimensional Linear Dynamic Hashing. PODS 1983: 90-105 BibTeX
[17]
Michael J. Quinn, Narsingh Deo: Parallel Graph Algorithms. ACM Comput. Surv. 16(3): 319-348(1984) BibTeX
[18]
Carla D. Savage, Joseph JáJá: Fast, Efficient Parallel Algorithms for Some Graph Problems. SIAM J. Comput. 10(4): 682-691(1981) BibTeX
[19]
Peter Scheuermann, Aris M. Ouksel: Multidimensional B-trees for associative searching in database systems. Inf. Syst. 7(2): 123-137(1982) BibTeX
[20]
...
[21]
Patrick Valduriez, Georges Gardarin: Multiprocessor Join Algorithms of Relations. JCDKB 1982: 219-236 BibTeX
[22]
S. Bing Yao: Approximating the Number of Accesses in Database Organizations. Commun. ACM 20(4): 260-261(1977) BibTeX
[23]
...
[24]
Clement T. Yu, Cheing-Mei Suen, K. Lam, M. K. Siu: Adaptive Record Clustering. ACM Trans. Database Syst. 10(2): 180-204(1985) BibTeX

Referenced by

  1. Kiran J. Achyutuni, Edward Omiecinski, Shamkant B. Navathe: Two Techniques for On-Line Index Modification in Shared Nothing Parallel Databases. SIGMOD Conference 1996: 125-136
  2. Gerhard Weikum, Peter Zabback, Peter Scheuermann: Dynamic File Allocation in Disk Arrays. SIGMOD Conference 1991: 406-415
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
TODS, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Tue Jun 24 18:39:09 2008