ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Rearranging Data to Maximize the Efficiency of Compression.

Frank Olken, Doron Rotem: Rearranging Data to Maximize the Efficiency of Compression. PODS 1986: 78-90
@inproceedings{DBLP:conf/pods/OlkenR86,
  author    = {Frank Olken and
               Doron Rotem},
  title     = {Rearranging Data to Maximize the Efficiency of Compression},
  booktitle = {Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles
               of Database Systems, March 24-26, 1986, Cambridge, Massachusetts},
  publisher = {ACM},
  year      = {1986},
  isbn      = {0-89791-179-2},
  pages     = {78-90},
  ee        = {http://doi.acm.org/10.1145/6012.15407, db/conf/pods/OlkenR86.html},
  crossref  = {DBLP:conf/pods/86},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We are concerned with rearranging the order in which data is stored in a database so as to maximize the amount of compression. We consider multi-attribute data defined over finite discrete domains (called categories) and seek an optimal permutation of the categories for each attribute. We study both deterministic and probabilistic models of data distribution.

We show that the deterministic category rearrangement problem is NP-complete (for run length encoding compression methods) via a reduction to the rectilinear traveling salesman problem. For the probabilistic model, we show that the optimal category rearrangement for each attribute has the form of a double pipe organ, which is a generalized version of the well-known pipe organ arrangement found in earlier work on record placement on disk cylinders. For k-dimensional data we obtain an algorithm for finding the optimal arrangement which is O(n2) for fixed k.

Finally, some open problems are stated.

Copyright © 1986 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ... BibTeX

Printed Edition

Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, March 24-26, 1986, Cambridge, Massachusetts. ACM 1986, ISBN 0-89791-179-2
Contents BibTeX

Online Edition: ACM Digital Library

Journal Version

Frank Olken, Doron Rotem: Rearranging Data to Maximize the Efficiency of Compression. J. Comput. Syst. Sci. 38(2): 405-430(1989) BibTeX

References

[Bat79]
Don S. Batory: On Searching Transposed Files. ACM Trans. Database Syst. 4(4): 531-544(1979) BibTeX
[Ber72]
Patrick P. Bergmans: Minimizing Expected Travel Time on Geometrical Patterns by Optimal Probability Rearrangements. Information and Control 20(4): 331-350(1972) BibTeX
[Bia69]
...
[BK74]
...
[But71]
...
[EOS81]
Susan J. Eggers, Frank Olken, Arie Shoshani: A Compression Technique for Large Statistical Data-Bases. VLDB 1981: 424-434 BibTeX
[ES80]
Susan J. Eggers, Arie Shoshani: Efficient Access of Compressed Data. VLDB 1980: 205-211 BibTeX
[GGJ76]
M. R. Garey, Ronald L. Graham, David S. Johnson: Some NP-Complete Geometric Problems. STOC 1976: 10-22 BibTeX
[Gol66]
...
[HLP52]
...
[HS77]
...
[PAB68]
...
[Wie83]
...
[Won80]
C. K. Wong: Minimizing Expected Head Movement in One-Dimensional and Two-Dimensional Mass Storage Systems. ACM Comput. Surv. 12(2): 167-178(1980) BibTeX
[Won83]
...

Referenced by

  1. H. V. Jagadish, J. Madar, Raymond T. Ng: Semantic Compression and Pattern Extraction with Fascicles. VLDB 1999: 186-198
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:33:49 2009