ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Memory Management During Run Generation in External Sorting.

Per-Åke Larson, Goetz Graefe: Memory Management During Run Generation in External Sorting. SIGMOD Conference 1998: 472-483
@inproceedings{DBLP:conf/sigmod/LarsonG98,
  author    = {Per-{\AA}ke Larson and
               Goetz Graefe},
  editor    = {Laura M. Haas and
               Ashutosh Tiwary},
  title     = {Memory Management During Run Generation in External Sorting},
  booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
               on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-995-5},
  pages     = {472-483},
  ee        = {http://doi.acm.org/10.1145/276304.276346, db/conf/sigmod/LarsonG98.html},
  crossref  = {DBLP:conf/sigmod/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

If replacement selection is used in an external mergesort to generate initial runs, individual records are deleted and inserted in the sort operation's workspace. Variable-length records introduce the need for possibly complex memory management and extra copying of records. As a result, few systems employ replacement selection, even though it produces longer runs than commonly used algorithms. We experimentally compared several algorithms and variants for managing this workspace. We found that the simple best fit algorithm achieves memory utilization of 90% or better and run lengths over 1.8 times workspace size, with no extra copying of records and very little other overhead, for widely varying record sizes and for a wide range of memory sizes. Thus, replacement selection is a viable algorithm for commercial database systems, even for variable-length records.

Efficient memory management also enables an external sort algorithm that degrades gracefully when its input is only slightly larger than or a small multiple of the available memory size. This is not the case with the usual implementations of external sorting, which incur I/O for the entire input even if it is as little as one record larger than memory. Thus, in some cases, our techniques may reduce I/O volume by a factor 10 compared to traditional database sort algorithms. Moreover, the gradual rather than abrupt growth in I/O volume for increasing input sizes significantly eases design and implementation of intra-query memory management policies.

Copyright © 1998 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD DiSC

CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ... Online Version (ACM WWW Account required): Full Text in PDF Format

ACM SIGMOD Anthology

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Laura M. Haas, Ashutosh Tiwary (Eds.): SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA. ACM Press 1998, ISBN 0-89791-995-5 BibTeX , SIGMOD Record 27(2), June 1998
Contents

Online Edition: ACM SIGMOD

[Abstract]
[Full Text (Postscript)]

References

[1]
Vladimir Estivill-Castro, Derick Wood: A Survey of Adaptive Sorting Algorithms. ACM Comput. Surv. 24(4): 441-476(1992) BibTeX
[2]
Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993) BibTeX
[3]
Goetz Graefe: Sort-Merge-Join: An Idea Whose Time Has(h) Passed? ICDE 1994: 406-417 BibTeX
[4]
Masaya Nakayama, Masaru Kitsuregawa, Mikio Takagi: Hash-Partitioned Join Method Using Dynamic Destaging Strategy. VLDB 1988: 468-478 BibTeX
[5]
Donald E. Knuth: The Art of Computer Programming, Volume I: Fundamental Algorithms, 2nd Edition. Addison-Wesley 1973
BibTeX
[6]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
BibTeX
[7]
...
[8]
Vinay S. Pai, Peter J. Varman: Prefetching with Multiple Disks for External Mergesort: Simulation and Analysis. ICDE 1992: 273-282 BibTeX
[9]
Betty Salzberg: Merging Sorted Runs Using Large Main Memory. Acta Inf. 27(3): 195-215(1989) BibTeX
[10]
Paul R. Wilson, Mark S. Johnstone, Michael Neely, David Boles: Dynamic Storage Allocation: A Survey and Critical Review. IWMM 1995: 1-116 BibTeX
[11]
LuoQuan Zheng, Per-Åke Larson: Speeding up External Mergesort. IEEE Trans. Knowl. Data Eng. 8(2): 322-332(1996) BibTeX

Referenced by

  1. Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, David A. Wood: DBMSs on a Modern Processor: Where Does Time Go? VLDB 1999: 266-277
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:45 2009