Rearranging Data to Maximize the Efficiency of Compression.
Frank Olken, Doron Rotem:
Rearranging Data to Maximize the Efficiency of Compression.
PODS 1986: 78-90@inproceedings{DBLP:conf/pods/OlkenR86,
author = {Frank Olken and
Doron Rotem},
title = {Rearranging Data to Maximize the Efficiency of Compression},
booktitle = {Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles
of Database Systems, March 24-26, 1986, Cambridge, Massachusetts},
publisher = {ACM},
year = {1986},
isbn = {0-89791-179-2},
pages = {78-90},
ee = {http://doi.acm.org/10.1145/6012.15407, db/conf/pods/OlkenR86.html},
crossref = {DBLP:conf/pods/86},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
We are concerned with rearranging the order in which data is stored in a database so as to maximize the amount of compression. We consider multi-attribute data defined over finite discrete domains (called categories) and seek an optimal permutation of the categories for each attribute. We study both deterministic and probabilistic models of data distribution.
We show that the deterministic category rearrangement problem is NP-complete (for run length encoding compression methods) via a reduction to the rectilinear traveling salesman problem. For the probabilistic model, we show that the optimal category rearrangement for each attribute has the form of a double pipe organ, which is a generalized version of the well-known pipe organ arrangement found in earlier work on record placement on disk cylinders. For k-dimensional data we obtain an algorithm for finding the optimal arrangement which is O(n2) for fixed k.
Finally, some open problems are stated.
Copyright © 1986 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98.
and ...
Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings.
and ...
BibTeX
Printed Edition
Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, March 24-26, 1986, Cambridge, Massachusetts.
ACM 1986, ISBN 0-89791-179-2
Contents BibTeX
Journal Version
Frank Olken, Doron Rotem:
Rearranging Data to Maximize the Efficiency of Compression.
J. Comput. Syst. Sci. 38(2): 405-430(1989) BibTeX
References
- [Bat79]
- Don S. Batory:
On Searching Transposed Files.
ACM Trans. Database Syst. 4(4): 531-544(1979) BibTeX
- [Ber72]
- Patrick P. Bergmans:
Minimizing Expected Travel Time on Geometrical Patterns by Optimal Probability Rearrangements.
Information and Control 20(4): 331-350(1972) BibTeX
- [Bia69]
- ...
- [BK74]
- ...
- [But71]
- ...
- [EOS81]
- Susan J. Eggers, Frank Olken, Arie Shoshani:
A Compression Technique for Large Statistical Data-Bases.
VLDB 1981: 424-434 BibTeX
- [ES80]
- Susan J. Eggers, Arie Shoshani:
Efficient Access of Compressed Data.
VLDB 1980: 205-211 BibTeX
- [GGJ76]
- M. R. Garey, Ronald L. Graham, David S. Johnson:
Some NP-Complete Geometric Problems.
STOC 1976: 10-22 BibTeX
- [Gol66]
- ...
- [HLP52]
- ...
- [HS77]
- ...
- [PAB68]
- ...
- [Wie83]
- ...
- [Won80]
- C. K. Wong:
Minimizing Expected Head Movement in One-Dimensional and Two-Dimensional Mass Storage Systems.
ACM Comput. Surv. 12(2): 167-178(1980) BibTeX
- [Won83]
- ...
Referenced by
- H. V. Jagadish, J. Madar, Raymond T. Ng:
Semantic Compression and Pattern Extraction with Fascicles.
VLDB 1999: 186-198
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:33:49 2009