Digital Symposium Collection 2000  

 
 
 
 
 
 

 




















Querying Aggregate Data

Stéphane Grumbach, Maurizio Rafanelli, and Leonardo Tininini

  View Paper (PDF)  

Return to Views/Queries

Abstract
We introduce a first-order language with real polynomial arithmetic and aggregation operators (count, iterated sum and multiply), which is well suited for the definition of aggregate queries involving complex statistical functions. It offers a good trade-off between expressive power and complexity, with a tractable data complexity. Interestingly, some fundamental properties of first-order with real arithmetic are preserved in the presence of aggregates. In particular, there is an effective quantifier elimination for formulae with aggregation. We consider the problem of querying data that has already been aggregated in aggregate views, and focus on queries with an aggregation over a conjunctive query. Our main conceptual contribution is the introduction of a new equivalence relation among conjunctive queries, the isomorphism modulo a product. We prove that the equivalence of aggregate queries such as for instance averages reduces to it. Deciding if two queries are isomorphic modulo a product is shown to be NP-complete. We then show that the problem of complete rewriting of count queries using count views is also NP-complete. Finally, we introduce new rewriting techniques based on the isomorphism modulo a product to recover the values of counts by complex arithmetical computation from the views. We conclude by showing how these techniques can be used to perform automatic aggregation.


References

Note: References link to DBLP on the Web.

[BL96]
Michael Benedikt , Leonid Libkin : On the Structure of Queries in Constraint Query Languages. LICS 1996 : 25-34
[CV93]
Surajit Chaudhuri , Moshe Y. Vardi : Optimization of Real Conjunctive Queries. PODS 1993 : 59-70
[DMM94]
...
[Dri82]
...
[GBLP96]
Jim Gray , Adam Bosworth , Andrew Layman , Hamid Pirahesh : Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE 1996 : 152-159
[Gho86]
Sakti P. Ghosh : Statistical Relational Tables for Statistical Database Management. TSE 12(12) : 1106-1116(1986)
[GHQ95]
Ashish Gupta , Venky Harinarayan , Dallan Quass : Aggregate-Query Processing in Data Warehousing Environments. VLDB 1995 : 358-369
[GLMW96]
...
[HRU96]
Venky Harinarayan , Anand Rajaraman , Jeffrey D. Ullman : Implementing Data Cubes Efficiently. SIGMOD Conf. 1996 : 205-216
[IS96]
Oscar H. Ibarra , Jianwen Su : On the Containment and Equivalence of Database Queries with Linear Constraints. PODS 1997 : 32-43
[KKR90]
Paris C. Kanellakis , Gabriel M. Kuper , Peter Z. Revesz : Constraint Query Languages. PODS 1990 : 299-313
[LM96]
Alon Y. Levy , Inderpal Singh Mumick : Reasoning with Aggregation Constraints. EDBT 1996 : 514-534
[LMSS95]
Alon Y. Levy , Alberto O. Mendelzon , Yehoshua Sagiv , Divesh Srivastava : Answering Queries Using Views. PODS 1995 : 95-104
[LS97]
Hans-J. Lenz , Arie Shoshani : Summarizability in OLAP and Statistical Data Bases. SSDBM 1997 : 132-143
[LW97]
Leonid Libkin , Limsoon Wong : On the Power of Aggregation in Relational Query Languages. DBPL 1997 : 260-280
[NSS98]
Werner Nutt , Yehoshua Sagiv , Sara Shurin : Deciding Equivalences Among Aggregate Queries. PODS 1998 : 214-223
[OOM87]
Gultekin Özsoyoglu , Z. Meral Özsoyoglu , Victor Matos : Extending Relational Algebra and Relational Calculus with Set-Valued Attributes and Aggregate Functions. TODS 12(4) : 566-592(1987)
[RBT96]
Maurizio Rafanelli , Antonia Bezenchek , Leonardo Tininini : The Aggregate Data Problem: A System for their Definition and Management. SIGMOD Record 25(4) : 8-13(1996)
[Ren92]
James Renegar : On the Computational Complexity and Geometry of the First-Order Theory of the Reals, Part I: Introduction. Preliminaries. The Geometry of Semi-Algebraic Sets. The Decision Problem for the Existential Theory of the Reals. JSC 13(3) : 255-300(1992)
[RR93]
Maurizio Rafanelli , Fabrizio L. Ricci : Mefisto: A Functional Model for Statistical Entities. TKDE 5(4) : 670-681(1993)
[RSSS98]
Kenneth A. Ross , Divesh Srivastava , Peter J. Stuckey , S. Sudarshan : Foundations of Aggregation Constraints. TCS 193(1-2) : 149-179(1998)
[SDJL96]
Divesh Srivastava , Shaul Dar , H. V. Jagadish , Alon Y. Levy : Answering Queries with Aggregation Using Views. VLDB 1996 : 318-329
[Sho97]
Arie Shoshani : OLAP and Statistical Databases: Similarities and Differences. PODS 1997 : 185-196
[Su83]
...
[SW85]
Arie Shoshani , Harry K. T. Wong : Statistical and Scientific Database Issues. TSE 11(10) : 1040-1047(1985)

Referenced by

  1. Michael Benedikt , Leonid Libkin : Exact and Approximate Aggregation in Constraint Query. PODS 1999 : 102-113

BIBTEX

@inproceedings{DBLP:conf/pods/GrumbachRT99,
  author    = {St{\'e}phane Grumbach and
                Maurizio Rafanelli and
                Leonardo Tininini},
   title     = {Querying Aggregate Data},
   booktitle = {Proceedings of the Eighteenth ACM SIGACT-SIGMOD-SIGART Symposium
                on Principles of Database Systems, May 31 - June 2, 1999, Philadelphia,
                Pennsylvania},
   publisher = {ACM Press},
   year      = {1999},
   isbn      = {1-58113-062-7},
   pages     = {174-184},
   crossref  = {DBLP:conf/pods/99},
   bibsource = {DBLP, http://dblp.uni-trier.de} } },


























Copyright(C) 2000 ACM