Percentile Finding Algorithm for Multiple Sorted Runs.
Balakrishna R. Iyer, Gary R. Ricard, Peter J. Varman:
Percentile Finding Algorithm for Multiple Sorted Runs.
VLDB 1989: 135-144@inproceedings{DBLP:conf/vldb/IyerRV89,
author = {Balakrishna R. Iyer and
Gary R. Ricard and
Peter J. Varman},
editor = {Peter M. G. Apers and
Gio Wiederhold},
title = {Percentile Finding Algorithm for Multiple Sorted Runs},
booktitle = {Proceedings of the Fifteenth International Conference on Very
Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands},
publisher = {Morgan Kaufmann},
year = {1989},
isbn = {1-55860-101-5},
pages = {135-144},
ee = {db/conf/vldb/IyerRV89.html},
crossref = {DBLP:conf/vldb/89},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
External sorting is frequently used by relational database systems for buildingindexes on tables, ordered retrieval, duplicate elimination, joins, subqueries,grouping, and aggregation; it would be quite beneficial to parallelize this function.
Previous parallel external sorting algorithms found in the database literature used a sequential merge as the final stage of the parallel sort.
This reduces the speedup gained through parallelism in earlier stages of sort.
The solution is to merge in parallel as well.
Load balanced parallel two way merges and approximately load balanced parallel multi way merges are known.
Measurements reported on parallel sorting that employs one of the approximate partitioning methods indicate that even if the sort keys are randomly distributed the load imbalance due to the approximation degrades speedup due to parallelism.
Sort key value skews, known to occur in database workloads, can only exacerbatethis problem.
We give, prove and analyze an efficient exact method which can find any percentile of an arbitrary number of sorted runs.
Application of our algorithm ensures load balance during the parallel merge.
By removing the effect of skews of sort key values which caused loss of speed up in previous approaches our method can improve the speedup for parallel sorting on multiple processors.
While we target our work to a parallel computer architecture of shared memory MIMD parallel processors, our results are also likely to be useful for other parallel computer architectures.
Copyright © 1989 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Online Paper
CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ...
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
BibTeX
Printed Edition
Peter M. G. Apers, Gio Wiederhold (Eds.):
Proceedings of the Fifteenth International Conference on Very Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands.
Morgan Kaufmann 1989, ISBN 1-55860-101-5
BibTeX
References
- [AKL87]
- Selim G. Akl, Nicola Santoro:
Optimal Parallel Merging and Sorting Without Memory Conflicts.
IEEE Trans. Computers 36(11): 1367-1369(1987) BibTeX
- [BAT68]
- Kenneth E. Batcher:
Sorting Networks and Their Applications.
AFIPS Spring Joint Computing Conference 1968: 307-314 BibTeX
- [BEC88]
- Micah Beck, Dina Bitton, W. Kevin Wilkinson:
Sorting Large Files on a Backend Multiprocessor.
IEEE Trans. Computers 37(7): 769-778(1988) BibTeX
- [BIT83]
- Dina Bitton, Haran Boral, David J. DeWitt, W. Kevin Wilkinson:
Parallel Algorithms for the Execution of Relational Database Operations.
ACM Trans. Database Syst. 8(3): 324-353(1983) BibTeX
- [DEO88]
- ...
- [FRA88]
- Rhys S. Francis, Ian D. Mathieson:
A Benchmark Parallel Sort for Shared Memory Multiprocessors.
IEEE Trans. Computers 37(12): 1619-1626(1988) BibTeX
- [FRE82]
- Greg N. Frederickson, Donald B. Johnson:
The Complexity of Selection and Ranking in X+Y and Matrices with Sorted Columns.
J. Comput. Syst. Sci. 24(2): 197-208(1982) BibTeX
- [GRA86]
- ...
- [IYE88]
- ...
- [IYE89]
- ...
- [LOR88]
- Raymond A. Lorie, Honesty C. Young:
A Low Communication Sort Algorithm for a Parallel Database Machine.
VLDB 1989: 125-134 BibTeX
- [QUI88]
- ...
- [KNU73]
- Donald E. Knuth:
The Art of Computer Programming, Volume III: Sorting and Searching.
Addison-Wesley 1973, ISBN 0-201-03803-X
BibTeX
- [OLK86]
- Frank Olken, Doron Rotem:
Simple Random Sampling from Relational Databases.
VLDB 1986: 160-169 BibTeX
- [VAL84]
- Patrick Valduriez, Georges Gardarin:
Join and Semijoin Algorithms for a Multiprocessor Database Machine.
ACM Trans. Database Syst. 9(1): 133-161(1984) BibTeX
- [VAR88]
- ...
- [VIT85]
- Jeffrey Scott Vitter:
Random Sampling with a Reservoir.
ACM Trans. Math. Softw. 11(1): 37-57(1985) BibTeX
Referenced by
- LuoQuan Zheng, Per-Åke Larson:
Speeding up External Mergesort.
IEEE Trans. Knowl. Data Eng. 8(2): 322-332(1996)
- Ming-Syan Chen, Philip S. Yu, Kun-Lung Wu:
Optimization of Parallel Execution for Multi-Join Queries.
IEEE Trans. Knowl. Data Eng. 8(3): 416-428(1996)
- Joel L. Wolf, Balakrishna R. Iyer, Krishna R. Pattipati, John Turek:
Optimal Buffer Partitioning for the Nested Block Join Algorithm.
ICDE 1991: 510-519
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:45:40 2009