A Super Scalar Sort Algorithm for RISC Processors.

Ramesh C. Agarwal: A Super Scalar Sort Algorithm for RISC Processors. SIGMOD Conference 1996: 240-246

@inproceedings{DBLP:conf/sigmod/Agarwal96,
  author    = {Ramesh C. Agarwal},
  editor    = {H. V. Jagadish and
               Inderpal Singh Mumick},
  title     = {A Super Scalar Sort Algorithm for RISC Processors},
  booktitle = {Proceedings of the 1996 ACM SIGMOD International Conference on
               Management of Data, Montreal, Quebec, Canada, June 4-6, 1996},
  publisher = {ACM Press},
  year      = {1996},
  pages     = {240-246},
  ee        = {http://doi.acm.org/10.1145/233269.233336, db/conf/sigmod/Agarwal96.html},
  crossref  = {DBLP:conf/sigmod/96},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The compare and branch sequences required in a traditional sort algorithm cannot efficiently exploit the multiple execution units present in currently available high-performance RISC processors. This is because of the long latency of the compare instructions and the sequential algorithm used in sorting. With the increased level of integration on a chip, this trend is expected to continue. We have developed new sort algorithms which eliminate almost all the compares, provide functional parallelism which can be exploited by multiple execution units, significantly reduce the number of passes through keys, and improve data locality. These new algorithms outperform traditional sort algorithms by a large factor.
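
To make the idea of a sort that avoids key-to-key compares concrete, here is a minimal sketch of a generic least-significant-byte radix sort over fixed-length byte keys. It is an illustration of the general technique only, not the algorithm described in the paper; the function name `lsd_radix_sort` and its parameters are hypothetical.

```python
def lsd_radix_sort(keys: list[bytes], key_len: int) -> list[bytes]:
    """Sort fixed-length byte keys without comparing one key to another.

    Generic LSD radix sort, shown only as an example of a compare-free
    sort; it is NOT the paper's algorithm.
    """
    for pos in range(key_len - 1, -1, -1):  # last byte first
        # Count how many keys carry each byte value at this position.
        counts = [0] * 256
        for key in keys:
            counts[key[pos]] += 1

        # Prefix sums turn the counts into the start offset of each bucket.
        starts = [0] * 256
        total = 0
        for value in range(256):
            starts[value] = total
            total += counts[value]

        # Stable scatter: keys go to their bucket in their current order.
        out = [b""] * len(keys)
        for key in keys:
            out[starts[key[pos]]] = key
            starts[key[pos]] += 1
        keys = out
    return keys


# Hypothetical 3-byte keys, used only to exercise the sketch.
example = [b"dog", b"cat", b"ant", b"cow"]
result = lsd_radix_sort(example, 3)
```

Each pass indexes a table by a key byte instead of branching on a compare, which is the property the abstract contrasts with compare and branch sequences. The sketch still makes one pass per key byte, so it does not show how the paper reduces the number of passes.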

For the Datamation disk-to-disk sort benchmark (one million 100-byte records), at SIGMOD'94, Chris Nyberg et al. presented several new performance records using DEC Alpha processor-based systems.

We have implemented the Datamation sort benchmark using our new sort algorithm on a desktop IBM RS/6000 model 39H (66.6 MHz) with 8 IBM SSA 7133 disk drives (total cost $73K). The total elapsed time for the 100 MB sort was 5.1 seconds (vs. the old uni-processor record of 9.1 seconds). We have also established a new price-performance record (0.2¢ vs. the old record of 0.9¢, as the cost of the sort). The entire sort processing was overlapped with I/O. During the read phase, we achieved a sustained BW of 47 MB/sec and during the write phase, we achieved a sustained BW of 39 MB/sec. Key extraction and sorting of one million 10-byte keys took only 0.6 seconds of CPU time. The rest of the CPU time was used in moving records, servicing I/O, and other overheads.
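
As a rough consistency check on the figures above, the following sketch recomputes the pure transfer time and the cost of the sort. The five-year proration period is an assumption (the abstract does not state how the 0.2¢ figure was derived), and all variable names are illustrative.

```python
# Figures quoted in the abstract.
DATA_MB = 100.0            # one million 100-byte records
READ_MB_PER_S = 47.0       # sustained read bandwidth
WRITE_MB_PER_S = 39.0      # sustained write bandwidth
ELAPSED_S = 5.1            # total elapsed time
SYSTEM_COST_USD = 73_000.0

# ASSUMPTION: the system cost is prorated over five years of use.
ASSUMED_LIFETIME_YEARS = 5
lifetime_s = ASSUMED_LIFETIME_YEARS * 365 * 24 * 3600

# Time needed just to move the data at the sustained bandwidths.
read_s = DATA_MB / READ_MB_PER_S
write_s = DATA_MB / WRITE_MB_PER_S
io_seconds = read_s + write_s

# Share of the system cost consumed by one 5.1-second sort, in cents.
cost_cents = SYSTEM_COST_USD / lifetime_s * ELAPSED_S * 100

print(f"read {read_s:.2f} s + write {write_s:.2f} s = {io_seconds:.2f} s")
print(f"prorated cost per sort: {cost_cents:.2f} cents")
```

Under these assumptions the transfers alone take about 4.7 of the 5.1 seconds, which fits the statement that sort processing was overlapped with I/O, and the prorated cost comes to about 0.24¢, in line with the reported 0.2¢.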

Algorithmic details leading to this level of performance are described in this paper. A detailed analysis of the CPU time spent during various phases of the sort algorithm and I/O is also provided.

Copyright © 1996 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.

Printed Edition

H. V. Jagadish, Inderpal Singh Mumick (Eds.): Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, June 4-6, 1996. ACM Press 1996, SIGMOD Record 25(2), June 1996

Online Edition: ACM Digital Library

References

[1]: ...
[2]: Bjørn Arild W. Baugstø, Jarle Fredrik Greipsland: Parallel Sorting Methods for Large Data Volumes on a Hypercube Database Computer. IWDM 1989: 127-141
[3]: ...
[4]: Zarka Cvetanovic, Dileep Bhandarkar: Characterization of Alpha AXP Performance Using TP and SPEC Workloads. ISCA 1994: 60-70
[5]: David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider: Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting. PDIS 1991: 280-291
[6]: ...
[7]: Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
[8]: Betty Salzberg, Alex Tsukerman, Jim Gray, Michael Stewart, Susan Uren, Bonnie Vaughan: FastSort: A Distributed Single-Input Single-Output External Sort. SIGMOD Conference 1990: 94-101
[9]: Chris Nyberg, Tom Barclay, Zarka Cvetanovic, Jim Gray, David B. Lomet: AlphaSort: A RISC Machine Sort. SIGMOD Conference 1994: 233-242
[10]: Chris Nyberg, Tom Barclay, Zarka Cvetanovic, Jim Gray, David B. Lomet: AlphaSort: A Cache-Sensitive Parallel External Sort. VLDB J. 4(4): 603-627 (1995)
[11]: ...
[12]: ...
[13]: ...
[14]: ...
[15]: ...

Referenced by

Lars Arge, Octavian Procopiuc, Sridhar Ramaswamy, Torsten Suel, Jeffrey Scott Vitter: Scalable Sweeping-Based Spatial Join. VLDB 1998: 570-581
Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, David E. Culler, Joseph M. Hellerstein, David A. Patterson: High-Performance Sorting on Networks of Workstations. SIGMOD Conference 1997: 243-254
