On the Computation of Multidimensional Aggregates.
Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi:
On the Computation of Multidimensional Aggregates.
VLDB 1996: 506-521@inproceedings{DBLP:conf/vldb/AgarwalADGNRS96,
author = {Sameet Agarwal and
Rakesh Agrawal and
Prasad Deshpande and
Ashish Gupta and
Jeffrey F. Naughton and
Raghu Ramakrishnan and
Sunita Sarawagi},
editor = {T. M. Vijayaraman and
Alejandro P. Buchmann and
C. Mohan and
Nandlal L. Sarda},
title = {On the Computation of Multidimensional Aggregates},
booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
publisher = {Morgan Kaufmann},
year = {1996},
isbn = {1-55860-382-4},
pages = {506-521},
ee = {db/conf/vldb/AgarwalADGNRS96.html},
crossref = {DBLP:conf/vldb/96},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
At the heart of all OLAP or multidimensional
data analysis is the ability to simultaneously
aggregate across many sets of dimensions.
Computing multidimensional aggregates is a performance
bottleneck for these applications.
This paper presents fast algorithms for computing a
collection of group-bys.
We focus on a special case of the aggregation problem -
computation of the CUBE operator.
The CUBE operator requires computing group-bys on all possible
combinations of a list of attributes, and is equivalent to the union
of a number of standard group-by operations.
We show how the structure of CUBE computations
can be viewed in terms of a hierarchy of group-by operations.
Our algorithms extend sort-based and hash-based grouping methods
with several optimizations, like combining common operations across
multiple group-bys, caching, and using pre-computed group-bys for
computing other group-bys.
Empirical evaluation shows that the resulting algorithms
give much better performance compared to straightforward methods.
This paper combines work done concurrently on
computing the data cube by two different teams
as reported in
[SAG96] and
[DANR96].
Copyright © 1996 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Online Paper
CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ...
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
BibTeX
Printed Edition
T. M. Vijayaraman, Alejandro P. Buchmann, C. Mohan, Nandlal L. Sarda (Eds.):
VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India.
Morgan Kaufmann 1996, ISBN 1-55860-382-4
Contents BibTeX
Electronic Edition
References
- [CM89]
- Meng Chang Chen, Lawrence McNamee:
On the Data Model and Access Method of Summary Data Management.
IEEE Trans. Knowl. Data Eng. 1(4): 519-529(1989) BibTeX
- [GBLP96]
- Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh:
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total.
ICDE 1996: 152-159 BibTeX
- [GJ79]
- M. R. Garey, David S. Johnson:
Computers and Intractability: A Guide to the Theory of NP-Completeness.
W. H. Freeman 1979, ISBN 0-7167-1044-7
BibTeX
- [GLS94]
- Goetz Graefe, Ann Linville, Leonard D. Shapiro:
Sort versus Hash Revisited.
IEEE Trans. Knowl. Data Eng. 6(6): 934-944(1994) BibTeX
- [Gra93]
- Goetz Graefe:
Query Evaluation Techniques for Large Databases.
ACM Comput. Surv. 25(2): 73-170(1993) BibTeX
- [HNSS95]
- Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Lynne Stokes:
Sampling-Based Estimation of the Number of Distinct Values of an Attribute.
VLDB 1995: 311-322 BibTeX
- [JS96]
- ...
- [PS82]
- Christos H. Papadimitriou, Kenneth Steiglitz:
Combinatorial Optimization: Algorithms and Complexity.
Prentice-Hall 1982, ISBN 0-13-152462-3
BibTeX
- [FELL57]
- ...
- [HRU96]
- Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman:
Implementing Data Cubes Efficiently.
SIGMOD Conference 1996: 205-216 BibTeX
- [GHRU96]
- ...
- [EPST79]
- ...
- [SN95]
- Ambuj Shatdal, Jeffrey F. Naughton:
Adaptive Parallel Aggregation Algorithms.
SIGMOD Conference 1995: 104-114 BibTeX
- [SDNR96]
- Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy:
Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies.
VLDB 1996: 522-531 BibTeX
- [SAG96]
- ...
- [DANR96]
- ...
- [Mic92]
- Zbigniew Michalewicz:
Statistical and Scientific Databases.
Ellis Horwood 1992
BibTeX
- [NC95]
- ...
- [CODD93]
- ...
- [FINK]
- ...
- [Sho82]
- Arie Shoshani:
Statistical Databases: Characteristics, Problems, and some Solutions.
VLDB 1982: 208-222 BibTeX
- [SR96]
- ...
- [STG95]
- ...
- [STL89]
- Jaideep Srivastava, Jack S. Eddy Tan, Vincent Y. Lum:
TBSAM: An Access Method for Efficient Processing of Statistical Queries.
IEEE Trans. Knowl. Data Eng. 1(4): 414-423(1989) BibTeX
- [WELD95]
- ...
Referenced by
- Carlos A. Hurtado, Alberto O. Mendelzon:
Reasoning about Summarizability in Heterogeneous Multidimensional Schemas.
ICDT 2001: 375-389
- Frank K. H. A. Dehne, Todd Eavis, Susanne E. Hambrusch, Andrew Rau-Chaplin:
Parallelizing the Data Cube.
ICDT 2001: 129-143
- Weifa Liang, Maria E. Orlowska, Jeffrey Xu Yu:
Optimizing Multiple Dimensional Queries Simultaneously in Multidimensional Databases.
VLDB J. 8(3-4): 319-338(2000)
- Themistoklis Palpanas:
Knowledge Discovery in Data Warehouses.
SIGMOD Record 29(3): 88-100(2000)
- Stéphane Grumbach, Leonardo Tininini:
On the Content of Materialized Aggregate Views.
PODS 2000: 47-57
- Steven Geffner, Divyakant Agrawal, Amr El Abbadi:
The Dynamic Data Cube.
EDBT 2000: 237-253
- Prasad Deshpande, Jeffrey F. Naughton:
Aggregate Aware Caching for Multi-Dimensional Queries.
EDBT 2000: 167-182
- Peter A. Boncz, Martin L. Kersten:
MIL Primitives for Querying a Fragmented World.
VLDB J. 8(2): 101-119(1999)
- Jianzhong Li, Doron Rotem, Jaideep Srivastava:
Aggregation Algorithms for Very Large Compressed Data Warehouses.
VLDB 1999: 651-662
- H. V. Jagadish, Laks V. S. Lakshmanan, Divesh Srivastava:
What can Hierarchies do for Data Warehouses?
VLDB 1999: 530-541
- Damianos Chatziantoniou:
Evaluation of Ad Hoc OLAP: In-Place Computation.
SSDBM 1999: 34-43
- Jeffrey Scott Vitter, Min Wang:
Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets.
SIGMOD Conference 1999: 193-204
- Yannis Kotidis, Nick Roussopoulos:
DynaMat: A Dynamic View Management System for Data Warehouses.
SIGMOD Conference 1999: 371-382
- Kevin S. Beyer, Raghu Ramakrishnan:
Bottom-Up Computation of Sparse and Iceberg CUBEs.
SIGMOD Conference 1999: 359-370
- Carlos A. Hurtado, Alberto O. Mendelzon, Alejandro A. Vaisman:
Maintaining Data Cubes under Dimension Updates.
ICDE 1999: 346-355
- Seigo Muto, Masaru Kitsuregawa:
A Dynamic Load Balancing Strategy for Parallel Datacube Computation.
DOLAP 1999: 67-72
- Jiawei Han:
Towards On-Line Analytical Mining in Large Databases.
SIGMOD Record 27(1): 97-107(1998)
- Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton:
Materialized View Selection for Multidimensional Datasets.
VLDB 1998: 488-499
- Frédéric Gingras, Laks V. S. Lakshmanan:
nD-SQL: A Multi-Dimensional Language for Interoperability and OLAP.
VLDB 1998: 134-145
- Yannis Kotidis, Nick Roussopoulos:
An Alternative Storage Organization for ROLAP Aggregate Views Based on Cubetrees.
SIGMOD Conference 1998: 249-258
- Prasad Deshpande, Karthikeyan Ramasamy, Amit Shukla, Jeffrey F. Naughton:
Caching Multidimensional Queries Using Chunks.
SIGMOD Conference 1998: 259-270
- John R. Smith, Chung-Sheng Li, Vittorio Castelli, Anant Jhingran:
Dynamic Assembly of Views in Data Cubes.
PODS 1998: 274-283
- Charu C. Aggarwal, Philip S. Yu:
Online Generation of Association Rules.
ICDE 1998: 402-411
- Shin-Chung Shao:
Multivariate and Multidimensional OLAP.
EDBT 1998: 120-134
- Sunita Sarawagi, Rakesh Agrawal, Nimrod Megiddo:
Discovery-Driven Exploration of OLAP Data Cubes.
EDBT 1998: 168-182
- Kenneth A. Ross, Divesh Srivastava, Damianos Chatziantoniou:
Complex Aggregation at Multiple Granularities.
EDBT 1998: 263-277
- Takeshi Fukuda, Hirofumi Matsuzawa:
Parallel Processing of Multiple Aggregate Queries on Shared-Nothing Multiprocessors.
EDBT 1998: 278-292
- Seigo Muto, Masaru Kitsuregawa:
Improving Main Memory Utilization for Array-Based DataCube Computation.
DOLAP 1998: 28-33
- Sunita Sarawagi:
Indexing OLAP Data.
IEEE Data Eng. Bull. 20(1): 36-43(1997)
- Venky Harinarayan:
Issues in Interactive Aggregation.
IEEE Data Eng. Bull. 20(1): 12-18(1997)
- Curtis E. Dyreson:
Using an Incomplete Data Cube as a Summary Data Sieve.
IEEE Data Eng. Bull. 20(1): 19-26(1997)
- Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy, Amit Shukla, Kristin Tufte, Yihong Zhao:
Cubing Algorithms, Storage Estimation, and Storage and Processing Alternatives for OLAP.
IEEE Data Eng. Bull. 20(1): 3-11(1997)
- Kenneth A. Ross, Divesh Srivastava:
Fast Computation of Sparse Datacubes.
VLDB 1997: 116-125
- Marc Gyssens, Laks V. S. Lakshmanan:
A Foundation for Multi-dimensional Databases.
VLDB 1997: 106-115
- Elena Baralis, Stefano Paraboschi, Ernest Teniente:
Materialized Views Selection in a Multidimensional Database.
VLDB 1997: 156-165
- Yihong Zhao, Prasad Deshpande, Jeffrey F. Naughton:
An Array-Based Algorithm for Simultaneous Multidimensional Aggregates.
SIGMOD Conference 1997: 159-170
- Nick Roussopoulos, Yannis Kotidis, Mema Roussopoulos:
Cubetree: Organization of and Bulk Updates on the Data Cube.
SIGMOD Conference 1997: 89-99
- Inderpal Singh Mumick, Dallan Quass, Barinderpal Singh Mumick:
Maintenance of Data Cubes and Summary Tables in a Warehouse.
SIGMOD Conference 1997: 100-111
- Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, Ramakrishnan Srikant:
Range Queries in OLAP Data Cubes.
SIGMOD Conference 1997: 73-88
- Joseph M. Hellerstein, Peter J. Haas, Helen J. Wang:
Online Aggregation.
SIGMOD Conference 1997: 171-182
- Ching-Tien Ho, Jehoshua Bruck, Rakesh Agrawal:
Partial-Sum Queries in Data Cubes Using Covering Codes.
PODS 1997: 228-237
- Rakesh Agrawal, Ashish Gupta, Sunita Sarawagi:
Modeling Multidimensional Databases.
ICDE 1997: 232-243
- Luca Cabibbo, Riccardo Torlone:
Querying Multidimensional Databases.
DBPL 1997: 319-335
- Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy:
Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies.
VLDB 1996: 522-531
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:46:12 2009