ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Implementing Data Cubes Efficiently.

Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216
@inproceedings{DBLP:conf/sigmod/HarinarayanRU96,
  author    = {Venky Harinarayan and
               Anand Rajaraman and
               Jeffrey D. Ullman},
  editor    = {H. V. Jagadish and
               Inderpal Singh Mumick},
  title     = {Implementing Data Cubes Efficiently},
  booktitle = {Proceedings of the 1996 ACM SIGMOD International Conference on
               Management of Data, Montreal, Quebec, Canada, June 4-6, 1996},
  publisher = {ACM Press},
  year      = {1996},
  pages     = {205-216},
  ee        = {http://doi.acm.org/10.1145/233269.233333, db/conf/sigmod/HarinarayanRU96.html},
  crossref  = {DBLP:conf/sigmod/96},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Decision support applications involve complex queries on very large databases. Since response times should be small, query optimization is critical. Users typically view the data as a multidimensional data cube. Each cell of the data cube is a view consisting of an aggregation of interest, like total sales. The values of many of these cells are dependent on the values of other cells in the data cube. A common and powerful query optimization technique is to materialize some or all of these cells rather than compute them from raw data each time. Commercial systems differ mainly in their approach to materializing the data cube. In this paper, we investigate the issue of which cells (views) to materialize when it is too expensive to materialize all views. A lattice framework is used to express dependencies among views. We present greedy algorithms that work off this lattice and determine a good set of views to materialize. The greedy algorithm performs within a small constant factor of optimal under a variety of models. We then consider the most common case of the hypercube lattice and examine the choice of materialized views for hypercubes in detail, giving some good tradeoffs between the space used and the average time to answer a query.
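
The greedy selection over a dependency lattice described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's code. It assumes a linear cost model in which answering a query on a view costs the row count of the smallest materialized view it can be computed from, and it assumes the top view (raw data) is always materialized. The lattice `children`, the `size` table, and the function names are hypothetical and chosen only for the example.

```python
def descendants(children, view):
    """Return every view computable from `view`, including `view` itself."""
    seen, stack = {view}, [view]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def greedy_select(children, size, top, k):
    """Pick up to k views (besides `top`) by repeatedly taking the largest benefit.

    Assumption: `top` is an ancestor of every view, so each view can
    initially be answered from `top` at cost size[top].
    """
    below = {v: descendants(children, v) for v in size}
    cost = {v: size[top] for v in size}  # current cheapest cost per view
    chosen = []

    def benefit(v):
        # Total saving over all views that could be answered from v instead.
        return sum(max(0, cost[w] - size[v]) for w in below[v])

    for _ in range(k):
        candidates = [v for v in size if v != top and v not in chosen]
        if not candidates:
            break
        best = max(candidates, key=benefit)
        if benefit(best) <= 0:
            break  # nothing left that saves any work
        chosen.append(best)
        for w in below[best]:
            cost[w] = min(cost[w], size[best])
    return chosen, cost


# Hypothetical toy lattice: an edge u -> v means v can be computed from u.
children = {
    "a": ["b", "c"], "b": ["d", "e"], "c": ["e", "f"],
    "d": ["g"], "e": ["g", "h"], "f": ["h"],
}
# Hypothetical view sizes (row counts).
size = {"a": 100, "b": 50, "c": 75, "d": 20, "e": 30, "f": 40, "g": 1, "h": 10}

chosen, cost = greedy_select(children, size, top="a", k=3)
print(chosen)              # ['b', 'f', 'd']
print(sum(cost.values()))  # 420
```

With nothing but the top view materialized, the total cost over the eight views in this toy lattice is 800; the three greedy picks bring it down to 420. Each round re-evaluates benefits against the views already chosen, which is why `f` overtakes `d` and `e` in the second round.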

Copyright © 1996 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Printed Edition

H. V. Jagadish, Inderpal Singh Mumick (Eds.): Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, June 4-6, 1996. ACM Press 1996, SIGMOD Record 25(2), June 1996

References

[Arb]
...
[Che96]
...
[CS94]
Surajit Chaudhuri, Kyuseok Shim: Including Group-By in Query Optimization. VLDB 1994: 354-366
[Fei96]
...
[GBLP95]
...
[GHQ95]
Ashish Gupta, Venky Harinarayan, Dallan Quass: Aggregate-Query Processing in Data Warehousing Environments. VLDB 1995: 358-369
[GHRU96]
Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Index Selection for OLAP. ICDE 1997: 208-219
[Gra93]
Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993)
[HRU95]
...
[HNSS95]
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Lynne Stokes: Sampling-Based Estimation of the Number of Distinct Values of an Attribute. VLDB 1995: 311-322
[OG95]
Patrick E. O'Neil, Goetz Graefe: Multi-Table Joins Through Bitmapped Join Indices. SIGMOD Record 24(3): 8-11(1995)
[Raa95]
...
[Rad95]
...
[STG]
...
[Xen94]
...

Referenced by

  1. Mirek Riedewald, Divyakant Agrawal, Amr El Abbadi: Flexible Data Cubes for Online Aggregation. ICDT 2001: 159-173
  2. Carlos A. Hurtado, Alberto O. Mendelzon: Reasoning about Summarizability in Heterogeneous Multidimensional Schemas. ICDT 2001: 375-389
  3. Frank K. H. A. Dehne, Todd Eavis, Susanne E. Hambrusch, Andrew Rau-Chaplin: Parallelizing the Data Cube. ICDT 2001: 129-143
  4. Francesco Buccafurri, Filippo Furfaro, Domenico Saccà: Estimating Range Queries Using Aggregate Data with Integrity Constraints: A Probabilistic Approach. ICDT 2001: 390-404
  5. Weifa Liang, Maria E. Orlowska, Jeffrey Xu Yu: Optimizing Multiple Dimensional Queries Simultaneously in Multidimensional Databases. VLDB J. 8(3-4): 319-338(2000)
  6. Themistoklis Palpanas: Knowledge Discovery in Data Warehouses. SIGMOD Record 29(3): 88-100(2000)
  7. Markos Zaharioudakis, Roberta Cochrane, George Lapis, Hamid Pirahesh, Monica Urata: Answering Complex SQL Queries Using Automatic Summary Tables. SIGMOD Conference 2000: 105-116
  8. Wolfgang Lehner, Richard Sidle, Hamid Pirahesh, Roberta Cochrane: Maintenance of Automatic Summary Tables. SIGMOD Conference 2000: 512-513
  9. Junghoo Cho, Hector Garcia-Molina: Synchronizing a Database to Improve Freshness. SIGMOD Conference 2000: 117-128
  10. Stéphane Grumbach, Leonardo Tininini: On the Content of Materialized Aggregate Views. PODS 2000: 47-57
  11. Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton: Materialized View Selection for Multi-Cube Data Models. EDBT 2000: 269-284
  12. Steven Geffner, Divyakant Agrawal, Amr El Abbadi: The Dynamic Data Cube. EDBT 2000: 237-253
  13. Prasad Deshpande, Jeffrey F. Naughton: Aggregate Aware Caching for Multi-Dimensional Queries. EDBT 2000: 167-182
  14. Márcio Farias de Souza, Marcus Costa Sampaio: Efficient Materialization and Use of Views in Data Warehouses. SIGMOD Record 28(1): 78-83(1999)
  15. Viswanath Poosala, Venkatesh Ganti, Yannis E. Ioannidis: Approximate Query Answering using Histograms. IEEE Data Eng. Bull. 22(4): 5-14(1999)
  16. Daniel Barbará, Xintao Wu: The Role of Approximations in Maintaining and Using Aggregate Views. IEEE Data Eng. Bull. 22(4): 15-21(1999)
  17. Torben Bach Pedersen, Christian S. Jensen, Curtis E. Dyreson: Extending Practical Pre-Aggregation in On-Line Analytical Processing. VLDB 1999: 663-674
  18. Jianzhong Li, Doron Rotem, Jaideep Srivastava: Aggregation Algorithms for Very Large Compressed Data Warehouses. VLDB 1999: 651-662
  19. Laks V. S. Lakshmanan, Fereidoon Sadri, Subbu N. Subramanian: On Efficiently Implementing SchemaSQL on an SQL Database System. VLDB 1999: 471-482
  20. H. V. Jagadish, Laks V. S. Lakshmanan, Divesh Srivastava: What can Hierarchies do for Data Warehouses? VLDB 1999: 530-541
  21. Anindya Datta, Krithi Ramamritham, Helen M. Thomas: Curio: A Novel Solution for Efficient Storage and Indexing in Data Warehouses. VLDB 1999: 730-733
  22. Viswanath Poosala, Venkatesh Ganti: Fast Approximate Answers to Aggregate Queries on a Data Cube. SSDBM 1999: 24-33
  23. Jeffrey Scott Vitter, Min Wang: Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets. SIGMOD Conference 1999: 193-204
  24. Wilburt Labio, Ramana Yerneni, Hector Garcia-Molina: Shrinking the Warehouse Update Window. SIGMOD Conference 1999: 383-394
  25. Yannis Kotidis, Nick Roussopoulos: DynaMat: A Dynamic View Management System for Data Warehouses. SIGMOD Conference 1999: 371-382
  26. H. V. Jagadish, Laks V. S. Lakshmanan, Divesh Srivastava: Snakes and Sandwiches: Optimal Clustering Strategies for a Data Warehouse. SIGMOD Conference 1999: 37-48
  27. Kevin S. Beyer, Raghu Ramakrishnan: Bottom-Up Computation of Sparse and Iceberg CUBEs. SIGMOD Conference 1999: 359-370
  28. Howard J. Karloff, Milena Mihail: On the Complexity of the View-Selection Problem. PODS 1999: 167-173
  29. Stéphane Grumbach, Maurizio Rafanelli, Leonardo Tininini: Querying Aggregate Data. PODS 1999: 174-184
  30. Himanshu Gupta, Divesh Srivastava: The Data Warehouse of Newsgroups. ICDT 1999: 471-488
  31. Himanshu Gupta, Inderpal Singh Mumick: Selection of Views to Materialize Under a Maintenance Cost Constraint. ICDT 1999: 453-470
  32. Carlos A. Hurtado, Alberto O. Mendelzon, Alejandro A. Vaisman: Maintaining Data Cubes under Dimension Updates. ICDE 1999: 346-355
  33. Steven Geffner, Divyakant Agrawal, Amr El Abbadi, Terence R. Smith: Relative Prefix Sums: An Efficient Approach for Querying Dynamic OLAP Data Cubes. ICDE 1999: 328-335
  34. Yuping Yang, Mukesh Singhal: Accessing Data Cubes along Complex Dimensions. DOLAP 1999: 73-78
  35. Hidetoshi Uchiyama, Kanda Runapongsa, Toby J. Teorey: A Progressive View Materialization Algorithm. DOLAP 1999: 36-41
  36. Seigo Muto, Masaru Kitsuregawa: A Dynamic Load Balancing Strategy for Parallel Datacube Computation. DOLAP 1999: 67-72
  37. Goretti K. Y. Chan, Qing Li, Ling Feng: Design and Selection of Materialized Views in a Data Warehousing Environment: A Case Study. DOLAP 1999: 42-47
  38. Anil Kumar, Vassilis J. Tsotras, Christos Faloutsos: Designing Access Methods for Bitemporal Databases. IEEE Trans. Knowl. Data Eng. 10(1): 1-20(1998)
  39. Elena Baralis, Stefano Ceri, Stefano Paraboschi: Compile-Time and Runtime Analysis of Active Behaviors. IEEE Trans. Knowl. Data Eng. 10(3): 353-370(1998)
  40. Alex G. Büchner, Maurice D. Mulvenna: Discovering Internet Marketing Intelligence through Online Analytical Web Usage Mining. SIGMOD Record 27(4): 54-61(1998)
  41. Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton: Materialized View Selection for Multidimensional Datasets. VLDB 1998: 488-499
  42. Guido Moerkotte: Small Materialized Aggregates: A Light Weight Index Structure for Data Warehousing. VLDB 1998: 476-487
  43. Yihong Zhao, Prasad Deshpande, Jeffrey F. Naughton, Amit Shukla: Simultaneous Optimization and Evaluation of Multiple Dimensional Queries. SIGMOD Conference 1998: 271-282
  44. Subbu N. Subramanian, Shivakumar Venkataraman: Cost-Based Optimization of Decision Support Queries Using Transient Views. SIGMOD Conference 1998: 319-330
  45. Yannis Kotidis, Nick Roussopoulos: An Alternative Storage Organization for ROLAP Aggregate Views Based on Cubetrees. SIGMOD Conference 1998: 249-258
  46. Prasad Deshpande, Karthikeyan Ramasamy, Amit Shukla, Jeffrey F. Naughton: Caching Multidimensional Queries Using Chunks. SIGMOD Conference 1998: 259-270
  47. Surajit Chaudhuri, Vivek R. Narasayya: AutoAdmin 'What-if' Index Analysis Utility. SIGMOD Conference 1998: 367-378
  48. John R. Smith, Chung-Sheng Li, Vittorio Castelli, Anant Jhingran: Dynamic Assembly of Views in Data Cubes. PODS 1998: 274-283
  49. Theodore Johnson: Coarse Indices for a Tape-Based Data Warehouse. ICDE 1998: 231-240
  50. Latha S. Colby, Richard L. Cole, Edward Haslam, Nasi Jazayeri, Galt Johnson, William J. McKenna, Lee Schumacher, David Wilhite: Redbrick Vista: Aggregate Computation and Management. ICDE 1998: 174-177
  51. Charu C. Aggarwal, Philip S. Yu: Online Generation of Association Rules. ICDE 1998: 402-411
  52. Dimitri Theodoratos, Timos K. Sellis: Data Warehouse Schema and Instance Design. ER 1998: 363-376
  53. Carsten Sapia, Markus Blaschka, Gabriele Höfling, Barbara Dinter: Extending the E/R Model for the Multidimensional Paradigm. ER Workshops 1998: 105-116
  54. Stijn Dekeyser, Bart Kuijpers, Jan Paredaens, Jef Wijsen: Nested Data Cubes for OLAP (Extended Abstract). ER Workshops 1998: 129-140
  55. Shin-Chung Shao: Multivariate and Multidimensional OLAP. EDBT 1998: 120-134
  56. Takeshi Fukuda, Hirofumi Matsuzawa: Parallel Processing of Multiple Aggregate Queries on Shared-Nothing Multiprocessors. EDBT 1998: 278-292
  57. Matteo Golfarelli, Stefano Rizzi: Methodological Framework for Data Warehouse Design. DOLAP 1998: 3-9
  58. Sanjay Goil, Alok N. Choudhary: High Performance Multidimensional Analysis of Large Datasets. DOLAP 1998: 34-39
  59. Barbara Dinter, Carsten Sapia, Gabriele Höfling, Markus Blaschka: The OLAP Market: State of the Art and Research Issues. DOLAP 1998: 22-27
  60. Jae-young Chang, Sang-goo Lee: Query Reformulation Using Materialized Views in Data Warehousing Environment. DOLAP 1998: 54-59
  61. Sunita Sarawagi: Indexing OLAP Data. IEEE Data Eng. Bull. 20(1): 36-43(1997)
  62. Theodore Johnson, Dennis Shasha: Some Approaches to Index Design for Cube Forest. IEEE Data Eng. Bull. 20(1): 27-35(1997)
  63. Venky Harinarayan: Issues in Interactive Aggregation. IEEE Data Eng. Bull. 20(1): 12-18(1997)
  64. Curtis E. Dyreson: Using an Incomplete Data Cube as a Summary Data Sieve. IEEE Data Eng. Bull. 20(1): 19-26(1997)
  65. Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy, Amit Shukla, Kristin Tufte, Yihong Zhao: Cubing Algorithms, Storage Estimation, and Storage and Processing Alternatives for OLAP. IEEE Data Eng. Bull. 20(1): 3-11(1997)
  66. Jian Yang, Kamalakar Karlapalem, Qing Li: Algorithms for Materialized View Design in Data Warehousing Environment. VLDB 1997: 136-145
  67. Dimitri Theodoratos, Timos K. Sellis: Data Warehouse Configuration. VLDB 1997: 126-135
  68. Kenneth A. Ross, Divesh Srivastava: Fast Computation of Sparse Datacubes. VLDB 1997: 116-125
  69. Marc Gyssens, Laks V. S. Lakshmanan: A Foundation for Multi-dimensional Databases. VLDB 1997: 106-115
  70. Christos Faloutsos, H. V. Jagadish, Nikolaos Sidiropoulos: Recovering Information from Summary Data. VLDB 1997: 36-45
  71. Surajit Chaudhuri, Vivek R. Narasayya: An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. VLDB 1997: 146-155
  72. Elena Baralis, Stefano Paraboschi, Ernest Teniente: Materialized Views Selection in a Multidimensional Database. VLDB 1997: 156-165
  73. Hans-Joachim Lenz, Arie Shoshani: Summarizability in OLAP and Statistical Data Bases. SSDBM 1997: 132-143
  74. Epaminondas Kapetanios, Moira C. Norrie: Data Mining and Modeling in Scientific Databases. SSDBM 1997: 24-27
  75. Vera Kamp, L. Sitzmann, Frank Wietek: A Spatial Data Cube Concept for Supporting Data Analysis in Environmental Epidemiology. SSDBM 1997: 100-103
  76. Nick Roussopoulos, Yannis Kotidis, Mema Roussopoulos: Cubetree: Organization of and Bulk Updates on the Data Cube. SIGMOD Conference 1997: 89-99
  77. Dallan Quass, Jennifer Widom: On-Line Warehouse View Maintenance. SIGMOD Conference 1997: 393-404
  78. Patrick E. O'Neil, Dallan Quass: Improved Query Performance with Variant Indexes. SIGMOD Conference 1997: 38-49
  79. Inderpal Singh Mumick, Dallan Quass, Barinderpal Singh Mumick: Maintenance of Data Cubes and Summary Tables in a Warehouse. SIGMOD Conference 1997: 100-111
  80. François Llirbat, Françoise Fabret, Eric Simon: Eliminating Costly Redundant Computations from SQL Trigger Executions. SIGMOD Conference 1997: 428-439
  81. Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, Ramakrishnan Srikant: Range Queries in OLAP Data Cubes. SIGMOD Conference 1997: 73-88
  82. Michael Gebhardt, Matthias Jarke, Stephan Jacobs: A Toolkit for Negotiation Support Interfaces to Multi-Dimensional Data. SIGMOD Conference 1997: 348-356
  83. Arie Shoshani: OLAP and Statistical Databases: Similarities and Differences. PODS 1997: 185-196
  84. Ching-Tien Ho, Jehoshua Bruck, Rakesh Agrawal: Partial-Sum Queries in Data Cubes Using Covering Codes. PODS 1997: 228-237
  85. Himanshu Gupta: Selection of Views to Materialize in a Data Warehouse. ICDT 1997: 98-112
  86. Wilburt Labio, Dallan Quass, Brad Adelberg: Physical Database Design for Data Warehouses. ICDE 1997: 277-288
  87. Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Index Selection for OLAP. ICDE 1997: 208-219
  88. Rakesh Agrawal, Ashish Gupta, Sunita Sarawagi: Modeling Multidimensional Databases. ICDE 1997: 232-243
  89. Luca Cabibbo, Riccardo Torlone: Querying Multidimensional Databases. DBPL 1997: 319-335
  90. Wolfgang Lehner, Thomas Ruf: A Redundancy-Based Optimization Approach for Aggregation in Multidimensional Scientific and Statistical Databases. DASFAA 1997: 253-262
  91. Ming-Syan Chen, Jiawei Han, Philip S. Yu: Data Mining: An Overview from a Database Perspective. IEEE Trans. Knowl. Data Eng. 8(6): 866-883(1996)
  92. Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy: Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies. VLDB 1996: 522-531
  93. Peter Scheuermann, Junho Shim, Radek Vingralek: WATCHMAN: A Data Warehouse Intelligent Cache Manager. VLDB 1996: 51-62
  94. Curtis E. Dyreson: Information Retrieval from an Incomplete Data Cube. VLDB 1996: 532-543
  95. Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi: On the Computation of Multidimensional Aggregates. VLDB 1996: 506-521
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:32 2009