ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies.

Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy: Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies. VLDB 1996: 522-531
@inproceedings{DBLP:conf/vldb/ShuklaDNR96,
  author    = {Amit Shukla and
               Prasad Deshpande and
               Jeffrey F. Naughton and
               Karthikeyan Ramasamy},
  editor    = {T. M. Vijayaraman and
               Alejandro P. Buchmann and
               C. Mohan and
               Nandlal L. Sarda},
  title     = {Storage Estimation for Multidimensional Aggregates in the Presence
               of Hierarchies},
  booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
               Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
  publisher = {Morgan Kaufmann},
  year      = {1996},
  isbn      = {1-55860-382-4},
  pages     = {522-531},
  ee        = {db/conf/vldb/ShuklaDNR96.html},
  crossref  = {DBLP:conf/vldb/96},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

To speed up multidimensional data analysis, database systems frequently precompute aggregates on some subsets of dimensions and their corresponding hierarchies. This improves query response time. However, the decision of what and how much to precompute is a difficult one. It is further complicated by the fact that precomputation in the presence of hierarchies can result in an unintuitively large increase in the amount of storage required by the database. Hence, it is interesting and useful to estimate the storage blowup that will result from a proposed set of precomputations without actually computing them. We propose three strategies for this problem: one based on sampling, one mathematical approximation, and one based on probabilistic counting. We investigate the accuracy of these algorithms in estimating the blowup for different data distributions and database schemas. The algorithm based upon probabilistic counting is particularly attractive, since it estimates the storage blowup to within provable error bounds while performing only a single scan of the data.

Copyright © 1996 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

T. M. Vijayaraman, Alejandro P. Buchmann, C. Mohan, Nandlal L. Sarda (Eds.): VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India. Morgan Kaufmann 1996, ISBN 1-55860-382-4
Contents BibTeX

Electronic Edition

References

[AAD+96]
Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi: On the Computation of Multidimensional Aggregates. VLDB 1996: 506-521 BibTeX
[AGS95]
...
[CCS93]
...
[Fel57]
...
[FM85]
Philippe Flajolet, G. Nigel Martin: Probabilistic Counting Algorithms for Data Base Applications. J. Comput. Syst. Sci. 31(2): 182-209(1985) BibTeX
[GBLP96]
Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE 1996: 152-159 BibTeX
[HNSS93]
...
[HNSS95]
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Lynne Stokes: Sampling-Based Estimation of the Number of Distinct Values of an Attribute. VLDB 1995: 311-322 BibTeX
[HRU96]
Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216 BibTeX
[KT95]
...
[Str95]
...
[Zip49]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949
BibTeX

Referenced by

  1. Frank K. H. A. Dehne, Todd Eavis, Susanne E. Hambrusch, Andrew Rau-Chaplin: Parallelizing the Data Cube. ICDT 2001: 129-143
  2. Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton: Materialized View Selection for Multi-Cube Data Models. EDBT 2000: 269-284
  3. Steven Geffner, Divyakant Agrawal, Amr El Abbadi: The Dynamic Data Cube. EDBT 2000: 237-253
  4. Torben Bach Pedersen, Christian S. Jensen, Curtis E. Dyreson: Extending Practical Pre-Aggregation in On-Line Analytical Processing. VLDB 1999: 663-674
  5. Hidetoshi Uchiyama, Kanda Runapongsa, Toby J. Teorey: A Progressive View Materialization Algorithm. DOLAP 1999: 36-41
  6. Seigo Muto, Masaru Kitsuregawa: A Dynamic Load Balancing Strategy for Parallel Datacube Computation. DOLAP 1999: 67-72
  7. Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton: Materialized View Selection for Multidimensional Datasets. VLDB 1998: 488-499
  8. Guido Moerkotte: Small Materialized Aggregates: A Light Weight Index Structure for Data Warehousing. VLDB 1998: 476-487
  9. Prasad Deshpande, Karthikeyan Ramasamy, Amit Shukla, Jeffrey F. Naughton: Caching Multidimensional Queries Using Chunks. SIGMOD Conference 1998: 259-270
  10. Charu C. Aggarwal, Philip S. Yu: Online Generation of Association Rules. ICDE 1998: 402-411
  11. Curtis E. Dyreson: Using an Incomplete Data Cube as a Summary Data Sieve. IEEE Data Eng. Bull. 20(1): 19-26(1997)
  12. Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy, Amit Shukla, Kristin Tufte, Yihong Zhao: Cubing Algorithms, Storage Estimation, and Storage and Processing Alternatives for OLAP. IEEE Data Eng. Bull. 20(1): 3-11(1997)
  13. Elena Baralis, Stefano Paraboschi, Ernest Teniente: Materialized Views Selection in a Multidimensional Database. VLDB 1997: 156-165
  14. Nick Roussopoulos, Yannis Kotidis, Mema Roussopoulos: Cubetree: Organization of and Bulk Updates on the Data Cube. SIGMOD Conference 1997: 89-99
  15. Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, Ramakrishnan Srikant: Range Queries in OLAP Data Cubes. SIGMOD Conference 1997: 73-88
  16. Ching-Tien Ho, Jehoshua Bruck, Rakesh Agrawal: Partial-Sum Queries in Data Cubes Using Covering Codes. PODS 1997: 228-237
  17. Rakesh Agrawal, Ashish Gupta, Sunita Sarawagi: Modeling Multidimensional Databases. ICDE 1997: 232-243
  18. Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi: On the Computation of Multidimensional Aggregates. VLDB 1996: 506-521
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:46:12 2009