Digital Symposium Collection 2000  

 
 
 
 
 
 

 





















Histogram-Based Approximation of Set-Valued Query-Answers

Yannis E. Ioannidis and Viswanath Poosala

  View Paper (PDF)  

Return to Data Mining Algorithms

Abstract
Answering queries approximately has recently been proposed as a way to reduce query response times in on-line decision support systems, when the precise answer is not necessary or early feedback is helpful. Most of the work in this area uses sampling-based techniques and handles aggregate queries, ignoring queries that return relations as answers. In this paper, we extend the scope of approximate query answering to general queries. We propose a novel and intuitive error measure for quantifying the error in an approximate query answer, which can be a multiset in general. We also study the use of histograms in approximate query answering as an alternative to sampling. In that direction, we develop a histogram algebra and demonstrate how complex SQL queries on a database may be translated into algebraic operations on the corresponding histograms. Finally, we present the results of an initial set of experiments where various types of histograms and sampling are compared with respect to their effectiveness in approximate query answering as captured by the introduced error measure. The results indicate that the MaxDiff(V,A) histograms provide quality approximations for both set-valued and aggregate queries, while sampling is competitive mainly for aggregate queries with no join operators.


References

Note: References link to DBLP on the Web.

[1]
Swarup Acharya , Phillip B. Gibbons , Viswanath Poosala , Sridhar Ramaswamy : Join Synopses for Approximate Query Answering. SIGMOD Conference 1999 : 275-286
[2]
Peter Buneman , Susan B. Davidson , Aaron Watters : A Semantics for Complex Objects and Approximate Queries. PODS 1988 : 305-314
[3]
Chung-Min Chen , Nick Roussopoulos : Adaptive Selectivity Estimation Using Query Feedback. SIGMOD Conference 1994 : 161-172
[4]
Phillip B. Gibbons , Yossi Matias : New Sampling-Based Summary Statistics for Improving Approximate Query Answers. SIGMOD Conference 1998 : 331-342
[5]
Phillip B. Gibbons , Yossi Matias , Viswanath Poosala : Fast Incremental Maintenance of Approximate Histograms. VLDB 1997 : 466-475
[6]
Joseph M. Hellerstein , Peter J. Haas , Helen Wang : Online Aggregation. SIGMOD Conference 1997 : 171-182
[7]
...
[8]
...
[9]
Robert Kooi : The Optimization of Queries in Relational Databases. Ph.D. thesis, Case Western Reserve University 1980
[10]
Richard J. Lipton , Jeffrey F. Naughton , Donovan A. Schneider : Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990 : 1-11
[11]
Amihai Motro : Query Generalization: A Method for Interpreting Null Answers. Expert Database Workshop 1984 : 597-616
[12]
Gultekin Özsoyoglu , Kaizheng Du , Sujatha Guru Swamy , Wen-Chi Hou : Processing Real-Time, Non-Aggregate Queries with Time-Constraints in CASE-DB. ICDE 1992 : 410-417
[13]
Gregory Piatetsky-Shapiro , Charles Connell : Accurate Estimation of the Number of Tuples Satisfying a Condition. SIGMOD Conference 1984 : 256-276
[14]
Viswanath Poosala , Venkatesh Ganti : Fast Approximate Answers to Aggregate Queries on a Data Cube. SSDBM 1999 : 24-33
[15]
Viswanath Poosala , Venkatesh Ganti : Fast Approximate Query Answering Using Precomputed Statistics. ICDE 1999 : 252
[16]
Viswanath Poosala , Yannis E. Ioannidis : Selectivity Estimation Without the Attribute Value Independence Assumption. VLDB 1997 : 486-495
[17]
Viswanath Poosala , Yannis E. Ioannidis , Peter J. Haas , Eugene J. Shekita : Improved Histograms for Selectivity Estimation of Range Predicates. SIGMOD Conf. 1996 : 294-305
[18]
...
[19]
Jeffrey Scott Vitter , Min Wang , Balakrishna R. Iyer : Data Cube Approximation and Histograms via Wavelets. CIKM 1998 : 96-104
[20]
Susan V. Vrbsky , Jane W.-S. Liu : APPROXIMATE - A Query Processor that Produces Monotonically Improving Approximate Answers. TKDE 5(6) : 1056-1068(1993)
[21]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949

Referenced by

  1. Swarup Acharya , Phillip B. Gibbons , Viswanath Poosala : Aqua: A Fast Decision Support Systems Using Approximate Query Answers. VLDB 1999 : 754-757

BIBTEX

@inproceedings{DBLP:conf/vldb/IoannidisP99,
  author    = {Yannis E. Ioannidis and
                Viswanath Poosala},
   editor    = {Malcolm P. Atkinson and
                Maria E. Orlowska and
                Patrick Valduriez and
                Stanley B. Zdonik and
                Michael L. Brodie},
   title     = {Histogram-Based Approximation of Set-Valued Query-Answers},
   booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
                Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
                UK},
   publisher = {Morgan Kaufmann},
   year      = {1999},
   isbn      = {1-55860-615-5},
   pages     = {174-185},
   crossref  = {DBLP:conf/vldb/99},
   bibsource = {DBLP, http://dblp.uni-trier.de} } },


























Copyright(C) 2000 ACM