Review - An Array-Based Algorithm for Simultaneous Multidimensional Aggregates.

Jiawei Han: Review - An Array-Based Algorithm for Simultaneous Multidimensional Aggregates. ACM SIGMOD Digital Review 1: (1999)

Review

This is a milestone paper [1] on implementing data warehouses with MOLAP technology or, more concretely, on data cube computation.

It shows how array-based aggregation over a multidimensional data warehouse can be implemented efficiently by combining three things: sparse-array techniques, the authors' new chunk-based compression technique, and an appropriate ordering of the memory-resident portions of the dimensions during the multiway, multidimensional array aggregation.
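To make the multiway idea concrete, here is a minimal sketch of my own, not the authors' implementation. It assumes a 3-D array with dimensions A, B and C, stored sparsely as a dictionary of non-empty cells, and it computes the AB, AC and BC group-bys in a single pass over the chunks. The function name `multiway_aggregate` and the `chunk` parameter are hypothetical, and the sketch does not model the memory accounting that makes the dimension ordering matter in the paper.

```python
from itertools import product

def multiway_aggregate(cells, shape, chunk):
    """Sum a sparse 3-D array onto AB, AC and BC in one pass over its chunks.

    cells: dict mapping (a, b, c) -> value; missing keys are empty cells
    shape: (na, nb, nc) dimension sizes
    chunk: (ca, cb, cc) chunk edge lengths (hypothetical parameter)
    """
    # Partition the non-empty cells by the chunk they fall in.
    # Empty chunks are never stored, which is the sparse-array part.
    chunks = {}
    for (a, b, c), v in cells.items():
        key = (a // chunk[0], b // chunk[1], c // chunk[2])
        chunks.setdefault(key, []).append(((a, b, c), v))

    ab, ac, bc = {}, {}, {}
    n_chunks = [-(-n // s) for n, s in zip(shape, chunk)]  # ceiling division

    # Visit chunks with A varying fastest, then B, then C.
    for kc, kb, ka in product(*(range(n) for n in reversed(n_chunks))):
        for (a, b, c), v in chunks.get((ka, kb, kc), []):
            # Each cell is read once and feeds all three group-bys.
            ab[a, b] = ab.get((a, b), 0) + v
            ac[a, c] = ac.get((a, c), 0) + v
            bc[b, c] = bc.get((b, c), 0) + v
    return ab, ac, bc


cells = {(0, 0, 0): 1, (0, 0, 1): 2, (1, 0, 0): 3, (3, 2, 1): 4}
ab, ac, bc = multiway_aggregate(cells, shape=(4, 4, 2), chunk=(2, 2, 1))
print(ab)
```

The point of the sketch is only that every cell is touched once while several aggregates are updated simultaneously, instead of rescanning the base data for each group-by.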

The paper also compares the approach with ROLAP-based computation and demonstrates the high performance of the array-based aggregation method.

The presentation is crystal clear, and I really enjoyed reading the paper!

I should note that computing the complete data cube may cause the cube's size to explode, especially as the number of dimensions grows. This places a serious limit on the number of dimensions that a MOLAP method, or any other method, can handle. This problem has been addressed in Ken Ross' VLDB'97 paper [2]. Another interesting solution was proposed at SIGMOD'99 by Kevin Beyer and Raghu Ramakrishnan [3].
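A back-of-the-envelope sketch of the size explosion, under my own simplifying assumption of a fully dense cube: with n dimensions there are 2^n group-bys, and if each dimension also gets an "ALL" value, the complete cube has at most the product of (cardinality + 1) cells. The helper name `cube_size` is hypothetical, and a sparse cube will contain fewer cells than this bound.

```python
from math import prod

def cube_size(cardinalities):
    """Return (number of group-bys, upper bound on cells) for a full data cube.

    cardinalities: number of distinct values in each dimension.
    The cell count assumes a dense cube, so it is an upper bound.
    """
    n_groupbys = 2 ** len(cardinalities)
    # Each dimension contributes its distinct values plus one "ALL" value.
    max_cells = prod(c + 1 for c in cardinalities)
    return n_groupbys, max_cells


for n in (3, 6, 10):
    groupbys, cells = cube_size([10] * n)
    print(n, groupbys, cells)
```

Even with only ten distinct values per dimension, both numbers grow exponentially in the number of dimensions.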

I have not yet seen new work in this direction, namely results on array-based computation of iceberg cubes. Can someone give more pointers on it?

References

[1]: Yihong Zhao, Prasad Deshpande, Jeffrey F. Naughton: An Array-Based Algorithm for Simultaneous Multidimensional Aggregates. SIGMOD Conference 1997: 159-170
[2]: Kenneth A. Ross, Divesh Srivastava: Fast Computation of Sparse Datacubes. VLDB 1997: 116-125
[3]: Kevin S. Beyer, Raghu Ramakrishnan: Bottom-Up Computation of Sparse and Iceberg CUBEs. SIGMOD Conference 1999: 359-370
