ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Range Queries in OLAP Data Cubes.

Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, Ramakrishnan Srikant: Range Queries in OLAP Data Cubes. SIGMOD Conference 1997: 73-88
@inproceedings{DBLP:conf/sigmod/HoAMS97,
  author    = {Ching-Tien Ho and
               Rakesh Agrawal and
               Nimrod Megiddo and
               Ramakrishnan Srikant},
  editor    = {Joan Peckham},
  title     = {Range Queries in OLAP Data Cubes},
  booktitle = {SIGMOD 1997, Proceedings ACM SIGMOD International Conference
               on Management of Data, May 13-15, 1997, Tucson, Arizona, USA},
  publisher = {ACM Press},
  year      = {1997},
  pages     = {73-88},
  ee        = {http://doi.acm.org/10.1145/253260.253274, db/conf/sigmod/HoAMS97.html},
  crossref  = {DBLP:conf/sigmod/97},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

A range query applies an aggregation operation over all selected cells of an OLAP data cube where the selection is specified by providing ranges of values for numeric dimensions. We present fast algorithms for range queries for two types of aggregation operations: SUM and MAX. These two operations cover techniques required for most popular aggregation operations, such as those supported by SQL.

For range-sum queries, the essential idea is to precompute some auxiliary information (prefix sums) that is used to answer ad hoc queries at run-time. By maintaining auxiliary information which is of the same size as the data cube, all range queries for a given cube can be answered in constant time, irrespective of the size of the sub-cube circumscribed by a query. Alternatively, one can keep auxiliary information which is 1/bd of the size of the d-dimensional data cube. Response to a range query may now require access to some cells of the data cube in addition to the access to the auxiliary information, but the overall time complexity is typically reduced significantly. We also discuss how the precomputed information is incrementally updated by batching updates to the data cube. Finally, we present algorithms for choosing the subset of the data cube dimensions for which the auxiliary information is computed and the blocking factor to use for each such subset.

Our approach to answering range-max queries is based on precomputed max over balanced hierarchical tree structures. We use a branch-and-bound-like procedure to speed up the finding of max in a region. We also show that with a branch-and-bound procedure, the average-case complexity is much smaller than the worst-case complexity.

Copyright © 1997 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

Online Version (ACM WWW Account required): Full Text in PDF Format

CDROM Version: Load the CDROM "Volume 1 Issue 1, SIGMOD '93-'97" and ...

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Joan Peckham (Ed.): SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA. ACM Press 1997 BibTeX , SIGMOD Record 26(2), June 1997
Contents

Online Edition: ACM Digital Library

[Index Terms]
[Full Text in PDF Format, 1867 KB]

References

[AAD+96]
Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi: On the Computation of Multidimensional Aggregates. VLDB 1996: 506-521 BibTeX
[AGS97]
Rakesh Agrawal, Ashish Gupta, Sunita Sarawagi: Modeling Multidimensional Databases. ICDE 1997: 232-243 BibTeX
[Ben80]
Jon Louis Bentley: Multidimensional Divide-and-Conquer. Commun. ACM 23(4): 214-229(1980) BibTeX
[BF79]
Jon Louis Bentley, Jerome H. Friedman: Data Structures for Range Searching. ACM Comput. Surv. 11(4): 397-409(1979) BibTeX
[BKSS90]
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. SIGMOD Conference 1990: 322-331 BibTeX
[BR91]
...
[Cha90]
Bernard Chazelle: Lower Bounds for Orthogonal Range Searching II. The Arithmetic Model. J. ACM 37(3): 439-463(1990) BibTeX
[CLS86]
...
[CM89]
Meng Chang Chen, Lawrence McNamee: On the Data Model and Access Method of Summary Data Management. IEEE Trans. Knowl. Data Eng. 1(4): 519-529(1989) BibTeX
[Cod93]
...
[Col96]
George Colliat: OLAP, Relational, and Multidimensional Database Systems. SIGMOD Record 25(3): 64-69(1996) BibTeX
[Com79]
Douglas Comer: The Ubiquitous B-Tree. ACM Comput. Surv. 11(2): 121-137(1979) BibTeX
[CR89]
...
[CS94]
Surajit Chaudhuri, Kyuseok Shim: Including Group-By in Query Optimization. VLDB 1994: 354-366 BibTeX
[GBLP96]
Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE 1996: 152-159 BibTeX
[GHQ95]
Ashish Gupta, Venky Harinarayan, Dallan Quass: Aggregate-Query Processing in Data Warehousing Environments. VLDB 1995: 358-369 BibTeX
[GHRU97]
Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Index Selection for OLAP. ICDE 1997: 208-219 BibTeX
[HBA97]
Ching-Tien Ho, Jehoshua Bruck, Rakesh Agrawal: Partial-Sum Queries in Data Cubes Using Covering Codes. PODS 1997: 228-237 BibTeX
[HRU96]
Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216 BibTeX
[IBM95]
...
[JD88]
Anil K. Jain, Richard C. Dubes: Algorithms for Clustering Data. Prentice-Hall 1988
BibTeX
[JS96]
...
[Lom95]
...
[Meh84]
...
[Mic92]
Zbigniew Michalewicz: Statistical and Scientific Databases. Ellis Horwood 1992
BibTeX
[Mit70]
...
[OLA96]
...
[RND77]
...
[Sam89]
...
[SAM96]
John C. Shafer, Rakesh Agrawal, Manish Mehta: SPRINT: A Scalable Parallel Classifier for Data Mining. VLDB 1996: 544-555 BibTeX
[SB95]
...
[SDNR96]
Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy: Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies. VLDB 1996: 522-531 BibTeX
[SR96]
...
[STL89]
Jaideep Srivastava, Jack S. Eddy Tan, Vincent Y. Lum: TBSAM: An Access Method for Efficient Processing of Statistical Queries. IEEE Trans. Knowl. Data Eng. 1(4): 414-423(1989) BibTeX
[Vai85]
Pravin M. Vaidya: Space-Time Tradeoffs for Orthogonal Range Queries (Extended Abstract). STOC 1985: 169-174 BibTeX
[WL85]
Dan E. Willard, George S. Lueker: Adding Range Restriction Capability to Dynamic Data Structures. J. ACM 32(3): 597-617(1985) BibTeX
[Yao85]
Andrew Chi-Chih Yao: On the Complexity of Maintaining Partial Sums. SIAM J. Comput. 14(2): 277-288(1985) BibTeX
[YL95]
Weipeng P. Yan, Per-Åke Larson: Eager Aggregation and Lazy Aggregation. VLDB 1995: 345-357 BibTeX

Referenced by

  1. Mirek Riedewald, Divyakant Agrawal, Amr El Abbadi: Flexible Data Cubes for Online Aggregation. ICDT 2001: 159-173
  2. Chung Keung Poon: Orthogonal Range Queries in OLAP. ICDT 2001: 361-374
  3. Steven Geffner, Divyakant Agrawal, Amr El Abbadi: The Dynamic Data Cube. EDBT 2000: 237-253
  4. Theodore Johnson, Dennis Shasha: Some Approaches to Index Design for Cude Forests. IEEE Data Eng. Bull. 22(4): 22-30(1999)
  5. Steven Geffner, Mirek Riedewald, Divyakant Agrawal, Amr El Abbadi: Data Cubes in Dynamic Environments. IEEE Data Eng. Bull. 22(4): 31-40(1999)
  6. Chee Yong Chan, Yannis E. Ioannidis: Hierarchical Prefix Cubes for Range-Sum Queries. VLDB 1999: 675-686
  7. Jeffrey Scott Vitter, Min Wang: Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets. SIGMOD Conference 1999: 193-204
  8. Steven Geffner, Divyakant Agrawal, Amr El Abbadi, Terence R. Smith: Relative Prefix Sums: An Efficient Approach for Querying Dynamic OLAP Data Cubes. ICDE 1999: 328-335
  9. Zuotao Li, Xiaoyang Sean Wang, Menas Kafatos, Ruixin Yang: A Pyramid Data Model for Supporting Content-Based Browsing and Knowledge Discovery. SSDBM 1998: 170-179
  10. Yossi Matias, Jeffrey Scott Vitter, Min Wang: Wavelet-Based Histograms for Selectivity Estimation. SIGMOD Conference 1998: 448-459
  11. Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications. SIGMOD Conference 1998: 94-105
  12. John R. Smith, Chung-Sheng Li, Vittorio Castelli, Anant Jhingran: Dynamic Assembly of Views in Data Cubes. PODS 1998: 274-283
  13. Shin-Chung Shao: Multivariate and Multidimensional OLAP. EDBT 1998: 120-134
  14. Ching-Tien Ho, Jehoshua Bruck, Rakesh Agrawal: Partial-Sum Queries in Data Cubes Using Covering Codes. PODS 1997: 228-237
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:36 2009