ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

External Memory Algorithms.

Jeffrey Scott Vitter: External Memory Algorithms. PODS 1998: 119-128
@inproceedings{DBLP:conf/pods/Vitter98,
  author    = {Jeffrey Scott Vitter},
  title     = {External Memory Algorithms},
  booktitle = {Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium
               on Principles of Database Systems, June 1-3, 1998, Seattle, Washington},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-996-3},
  pages     = {119-128},
  ee        = {http://doi.acm.org/10.1145/275487.275501, db/conf/pods/Vitter98.html},
  crossref  = {DBLP:conf/pods/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between the fast internal memory and the slower external memory (such as disks) can be a major performance bottleneck. In this tutorial we survey the state of the art in the design and analysis of external memory algorithms (also known as out-of-core algorithms or I/O algorithms). External memory algorithms are often designed using the parallel disk model (PDM). Its three primary machine-independent measures of an algorithm's performance are the number of I/O operations performed, the CPU time, and the amount of disk space used. The PDM allows for multiple disks (or disk arrays) and parallel CPUs, and it can be generalized to handle cache hierarchies, hierarchical memory, and tertiary storage.

We discuss a variety of problems and identify a set of paradigms for solving them efficiently in external memory. Programming tools and environments are available for simplifying the programming task. We present experiments involving some newly developed algorithms for spatial databases, implemented using the TPIE programming environment. The empirical timings show a significant speedup attained by the new algorithms over currently used methods.

Copyright © 1998 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ... BibTeX

Printed Edition

Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 1-3, 1998, Seattle, Washington. ACM Press 1998, ISBN 0-89791-996-3
Contents BibTeX

Online Edition: ACM Digital Library

[Index Terms]
[Full Text in PDF Format, 1637 KB]

References

[1]
...
[2]
Pankaj K. Agarwal, Lars Arge, Jeff Erickson, Paolo Giulio Franciosa, Jeffrey Scott Vitter: Efficient Searching with Linear Constraints. PODS 1998: 169-178 BibTeX
[3]
...
[4]
...
[5]
...
[6]
Alok Aggarwal, Jeffrey Scott Vitter: The Input/Output Complexity of Sorting and Related Problems. Commun. ACM 31(9): 1116-1127(1988) BibTeX
[7]
Lars Arge: The Buffer Tree: A New Technique for Optimal I/O-Algorithms (Extended Abstract). WADS 1995: 334-345 BibTeX
[8]
...
[9]
...
[10]
...
[11]
...
[12]
Lars Arge, Mikael Knudsen, Kirsten Larsen: A General Lower Bound on the I/O-Complexity of Comparison-based Algorithms. WADS 1993: 83-94 BibTeX
[13]
...
[14]
...
[15]
...
[16]
...
[17]
...
[18]
...
[19]
...
[20]
Rakesh D. Barve, Edward F. Grove, Jeffrey Scott Vitter: Simple Randomized Mergesort on Parallel Disks. Parallel Computing 23(4-5): 601-631(1997) BibTeX
[21]
Rudolf Bayer, Edward M. McCreight: Organization and Maintenance of Large Ordered Indices. Acta Inf. 1: 173-189(1972) BibTeX
[22]
Bruno Becker, Stephan Gschwind, Thomas Ohler, Bernhard Seeger, Peter Widmayer: An Asymptotically Optimal Multiversion B-Tree. VLDB J. 5(4): 264-275(1996) BibTeX
[23]
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. SIGMOD Conference 1990: 322-331 BibTeX
[24]
Stefan Berchtold, Christian Böhm, Hans-Peter Kriegel: Improving the Query Performance of High-Dimensional Index Structures by Bulk-Load Operations. EDBT 1998: 216-230 BibTeX
[25]
...
[26]
Paul B. Callahan, Michael T. Goodrich, Kumar Ramaiyer: Topology B-Trees and Their Applications. WADS 1995: 381-392 BibTeX
[27]
Peter M. Chen, Edward L. Lee, Garth A. Gibson, Randy H. Katz, David A. Patterson: RAID: High-Performance, Reliable Secondary Storage. ACM Comput. Surv. 26(2): 145-185(1994) BibTeX
[28]
...
[29]
Yi-Jen Chiang: Experiments on the Practical I/O Efficiency of Geometric Algorithms: Distribution Sweep vs. Plane Sweep. WADS 1995: 346-357 BibTeX
[30]
...
[31]
...
[32]
...
[33]
...
[34]
...
[35]
Robert Cypher, C. Greg Plaxton: Deterministic Sorting in Nearly Logarithmic Time on the Hypercube and Related Computers. J. Comput. Syst. Sci. 47(3): 501-548(1993) BibTeX
[36]
...
[37]
...
[38]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider: Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting. PDIS 1991: 280-291 BibTeX
[39]
James R. Driscoll, Neil Sarnak, Daniel Dominic Sleator, Robert Endre Tarjan: Making Data Structures Persistent. J. Comput. Syst. Sci. 38(1): 86-124(1989) BibTeX
[40]
...
[41]
...
[42]
Paolo Ferragina, Roberto Grossi: A fully-dynamic data structure for external substring search (Extended Abstract). STOC 1995: 693-702 BibTeX
[43]
...
[44]
...
[45]
Garth A. Gibson, Jeffrey Scott Vitter, John Wilkes: Strategic Directions in Storage I/O Issues in Large-Scale Computing. ACM Comput. Surv. 28(4): 779-793(1996) BibTeX
[46]
Michael T. Goodrich, Jyh-Jong Tsay, Darren Erik Vengroff, Jeffrey Scott Vitter: External-Memory Computational Geometry (Preliminary Version). FOCS 1993: 714-723 BibTeX
[47]
Diane Greene: An Implementation and Performance Analysis of Spatial Data Access Methods. ICDE 1989: 606-615 BibTeX
[48]
Roberto Grossi, Giuseppe F. Italiano: Efficient Splitting and Merging Algorithms for Order Decomposable Problems (Extended Abstract). ICALP 1997: 605-615 BibTeX
[49]
...
[50]
Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD Conference 1984: 47-57 BibTeX
[51]
Joseph M. Hellerstein, Elias Koutsoupias, Christos H. Papadimitriou: On the Analysis of Indexing Schemes. PODS 1997: 249-256 BibTeX
[52]
Lisa Hellerstein, Garth A. Gibson, Richard M. Karp, Randy H. Katz, David A. Patterson: Coding Techniques for Handling Failures in Large Disk Arrays. Algorithmica 12(2/3): 182-208(1994) BibTeX
[53]
Jia-Wei Hong, H. T. Kung: I/O Complexity: The Red-Blue Pebble Game. STOC 1981: 326-333 BibTeX
[54]
...
[55]
Ibrahim Kamel, Christos Faloutsos: On Packing R-trees. CIKM 1993: 490-499 BibTeX
[56]
Ibrahim Kamel, Christos Faloutsos: Hilbert R-tree: An Improved R-tree using Fractals. VLDB 1994: 500-509 BibTeX
[57]
...
[58]
Paris C. Kanellakis, Gabriel M. Kuper, Peter Z. Revesz: Constraint Query Languages. PODS 1990: 299-313 BibTeX
[59]
Paris C. Kanellakis, Sridhar Ramaswamy, Darren Erik Vengroff, Jeffrey Scott Vitter: Indexing for Data Models with Constraints and Classes. PODS 1993: 233-243 BibTeX
[60]
David G. Kirkpatrick, Raimund Seidel: The Ultimate Planar Convex Hull Algorithm? SIAM J. Comput. 15(1): 287-299(1986) BibTeX
[61]
...
[62]
...
[63]
...
[64]
...
[65]
Charles E. Leiserson, Satish Rao, Sivan Toledo: Efficient Out-of-Core Algorithms for Linear Relaxation Using Blocking Covers (Extended Abstract). FOCS 1993: 704-713 BibTeX
[66]
...
[67]
...
[68]
David B. Lomet, Betty Salzberg: Concurrency and Recovery for Index Trees. VLDB J. 6(3): 224-240(1997) BibTeX
[69]
...
[70]
Mark H. Nodine, Michael T. Goodrich, Jeffrey Scott Vitter: Blocking for External Graph Searching. Algorithmica 16(2): 181-214(1996) BibTeX
[71]
Mark H. Nodine, Daniel P. Lopresti, Jeffrey Scott Vitter: I/O Overhead and Parallel VLSI Architectures for Lattice Computations. IEEE Trans. Computers 40(7): 843-852(1991) BibTeX
[72]
...
[73]
Mark H. Nodine, Jeffrey Scott Vitter: Greed Sort: Optimal Deterministic Sorting on Parallel Disks. J. ACM 42(4): 919-933(1995) BibTeX
[74]
HweeHwa Pang, Michael J. Carey, Miron Livny: Memory-Adaptive External Sorting. VLDB 1993: 618-629 BibTeX
[75]
Jignesh M. Patel, David J. DeWitt: Partition Based Spatial-Merge Join. SIGMOD Conference 1996: 259-270 BibTeX
[76]
Sridhar Ramaswamy, Sairam Subramanian: Path Caching: A Technique for Optimal External Searching. PODS 1994: 25-35 BibTeX
[77]
Chris Ruemmler, John Wilkes: An Introduction to Disk Drive Modeling. IEEE Computer 27(3): 17-28(1994) BibTeX
[78]
...
[79]
...
[80]
Hanan Samet: The Design and Analysis of Spatial Data Structures. Addison-Wesley 1990
BibTeX
[81]
...
[82]
Timos K. Sellis, Nick Roussopoulos, Christos Faloutsos: The R+-Tree: A Dynamic Index for Multi-Dimensional Objects. VLDB 1987: 507-518 BibTeX
[83]
...
[84]
...
[85]
...
[86]
...
[87]
Roberto Tamassia, Jeffrey Scott Vitter: Optimal Cooperative Search in Fractional Cascaded Data Structures. Algorithmica 15(2): 154-171(1996) BibTeX
[88]
...
[89]
...
[90]
Jochen Van den Bercken, Bernhard Seeger, Peter Widmayer: A Generic Approach to Bulk Loading Multidimensional Index Structures. VLDB 1997: 406-415 BibTeX
[91]
...
[92]
Peter J. Varman, Rakesh M. Verma: An Efficient Multiversion Access STructure. IEEE Trans. Knowl. Data Eng. 9(3): 391-409(1997) BibTeX
[93]
...
[94]
...
[95]
...
[96]
...
[97]
Jeffrey Scott Vitter, Elizabeth A. M. Shriver: Algorithms for Parallel Memory I: Two-Level Memories. Algorithmica 12(2/3): 110-147(1994) BibTeX
[98]
Jeffrey Scott Vitter, Elizabeth A. M. Shriver: Algorithms for Parallel Memory II: Hierarchical Multilevel Memories. Algorithmica 12(2/3): 148-169(1994) BibTeX
[99]
...
[100]
...
[101]
...
[102]
Weiye Zhang, Per-Åke Larson: Dynamic Memory Adjustment for External Mergesort. VLDB 1997: 376-385 BibTeX

Referenced by

  1. Frank K. H. A. Dehne, Todd Eavis, Susanne E. Hambrusch, Andrew Rau-Chaplin: Parallelizing the Data Cube. ICDT 2001: 129-143
  2. Sudarshan S. Chawathe: Comparing Hierarchical Data in External Memory. VLDB 1999: 90-101
  3. Hongjun Zhu, Jianwen Su, Oscar H. Ibarra: An Index Structure for Spatial Joins in Linear Constraint Databases. ICDE 1999: 636-643
  4. Yossi Matias, Jeffrey Scott Vitter, Min Wang: Wavelet-Based Histograms for Selectivity Estimation. SIGMOD Conference 1998: 448-459
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:34:19 2009