New Sampling-Based Summary Statistics for Improving Approximate Query Answers.
Phillip B. Gibbons, Yossi Matias:
New Sampling-Based Summary Statistics for Improving Approximate Query Answers.
SIGMOD Conference 1998: 331-342@inproceedings{DBLP:conf/sigmod/GibbonsM98,
author = {Phillip B. Gibbons and
Yossi Matias},
editor = {Laura M. Haas and
Ashutosh Tiwary},
title = {New Sampling-Based Summary Statistics for Improving Approximate
Query Answers},
booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
publisher = {ACM Press},
year = {1998},
isbn = {0-89791-995-5},
pages = {331-342},
ee = {http://doi.acm.org/10.1145/276304.276334, db/conf/sigmod/GibbonsM98.html},
crossref = {DBLP:conf/sigmod/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
In large data recording and warehousing environments, it is
often advantageous to provide fast, approximate answers to
queries, whenever possible. Before DBMSs providing
highly-accurate approximate answers can become a reality, many
new techniques for summarizing data and for estimating
answers from summarized data must be developed. This paper
introduces two new sampling-based summary statistics,
concise samples and counting samples, and presents new
techniques for their fast incremental maintenance regardless
of the data distribution. We quantify their advantages over
standard sample views in terms of the number of additional
sample points for the same view size, and hence in providing
more accurate query answers. Finally, we consider their
application to providing fast approximate answers to hot list
queries. Our algorithms maintain their accuracy in the
presence of ongoing insertions to the data warehouse.
Copyright © 1998 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ...
Online Version (ACM WWW Account required): Full Text in PDF Format
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
BibTeX
Printed Edition
Laura M. Haas, Ashutosh Tiwary (Eds.):
SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA.
ACM Press 1998, ISBN 0-89791-995-5 BibTeX
,
SIGMOD Record 27(2),
June 1998
Contents
[Abstract]
[Full Text (Postscript)]
References
- [AMS96]
- ...
- [Ant92]
- Gennady Antoshenkov:
Random Sampling from Pseudo-Ranked B+ Trees.
VLDB 1992: 375-382 BibTeX
- [AS94]
- Rakesh Agrawal, Ramakrishnan Srikant:
Fast Algorithms for Mining Association Rules in Large Databases.
VLDB 1994: 487-499 BibTeX
- [AZ96]
- Gennady Antoshenkov, Mohamed Ziauddin:
Query Processing and Optimization in Oracle Rdb.
VLDB J. 5(4): 229-237(1996) BibTeX
- [BDF+97]
- Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik:
The New Jersey Data Reduction Report.
IEEE Data Eng. Bull. 20(4): 3-45(1997) BibTeX
- [BM96]
- Roberto J. Bayardo Jr., Daniel P. Miranker:
Processing Queries for First Few Answers.
CIKM 1996: 45-52 BibTeX
- [BMUT97]
- Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur:
Dynamic Itemset Counting and Implication Rules for Market Basket Data.
SIGMOD Conference 1997: 255-264 BibTeX
- [FJS97]
- Christos Faloutsos, H. V. Jagadish, Nikolaos Sidiropoulos:
Recovering Information from Summary Data.
VLDB 1997: 36-45 BibTeX
- [Fla85]
- Philippe Flajolet:
Approximate Counting: A Detailed Analysis.
BIT 25(1): 113-134(1985) BibTeX
- [FM83]
- Philippe Flajolet, G. Nigel Martin:
Probabilistic Counting.
FOCS 1983: 76-82 BibTeX
- [FM85]
- Philippe Flajolet, G. Nigel Martin:
Probabilistic Counting Algorithms for Data Base Applications.
J. Comput. Syst. Sci. 31(2): 182-209(1985) BibTeX
- [GM95]
- ...
- [GM97]
- ...
- [GMP97a]
- ...
- [GMP97b]
- Phillip B. Gibbons, Yossi Matias, Viswanath Poosala:
Fast Incremental Maintenance of Approximate Histograms.
VLDB 1997: 466-475 BibTeX
- [GPA+98]
- ...
- [HHW97]
- Joseph M. Hellerstein, Peter J. Haas, Helen J. Wang:
Online Aggregation.
SIGMOD Conference 1997: 171-182 BibTeX
- [HK95]
- ...
- [HNSS95]
- Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Lynne Stokes:
Sampling-Based Estimation of the Number of Distinct Values of an Attribute.
VLDB 1995: 311-322 BibTeX
- [IC93]
- Yannis E. Ioannidis, Stavros Christodoulakis:
Optimal Histograms for Limiting Worst-Case Error Propagation in the Size of Join Results.
ACM Trans. Database Syst. 18(4): 709-748(1993) BibTeX
- [Ioa93]
- Yannis E. Ioannidis:
Universality of Serial Histograms.
VLDB 1993: 256-267 BibTeX
- [IP95]
- Yannis E. Ioannidis, Viswanath Poosala:
Balancing Histogram Optimality and Practicality for Query Result Size Estimation.
SIGMOD Conference 1995: 233-244 BibTeX
- [Mat92]
- ...
- [Mor78]
- Robert Morris:
Counting Large Numbers of Events in Small Registers.
Commun. ACM 21(10): 840-842(1978) BibTeX
- [MSY96]
- ...
- [MVN93]
- ...
- [MVY94]
- ...
- [OR89]
- Frank Olken, Doron Rotem:
Random Sampling from B+ Trees.
VLDB 1989: 269-277 BibTeX
- [OR92]
- Frank Olken, Doron Rotem:
Maintenance of Materialized Views of Sampling Queries.
ICDE 1992: 632-641 BibTeX
- [PIHS96]
- Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita:
Improved Histograms for Selectivity Estimation of Range Predicates.
SIGMOD Conference 1996: 294-305 BibTeX
- [Pre97]
- ...
- [Vit85]
- Jeffrey Scott Vitter:
Random Sampling with a Reservoir.
ACM Trans. Math. Softw. 11(1): 37-57(1985) BibTeX
- [VL93]
- Susan V. Vrbsky, Jane W.-S. Liu:
APPROXIMATE - A Query Processor that Produces Monotonically Improving Approximate Answers.
IEEE Trans. Knowl. Data Eng. 5(6): 1056-1068(1993) BibTeX
- [WVZT90]
- Kyu-Young Whang, Brad T. Vander Zanden, Howard M. Taylor:
A Linear-Time Probabilistic Counting Algorithm for Database Applications.
ACM Trans. Database Syst. 15(2): 208-229(1990) BibTeX
Referenced by
- Francesco Buccafurri, Filippo Furfaro, Domenico Saccà:
Estimating Range Queries Using Aggregate Data with Integrity Constraints: A Probabilistic Approach.
ICDT 2001: 390-404
- Chris Olston, Jennifer Widom:
Offering a Precision-Performance Tradeoff for Aggregation Queries over Replicated Data.
VLDB 2000: 144-155
- Venkatesh Ganti, Mong-Li Lee, Raghu Ramakrishnan:
ICICLES: Self-Tuning Samples for Approximate Query Answering.
VLDB 2000: 176-187
- Christopher R. Palmer, Christos Faloutsos:
Density Biased Sampling: An Improved Method for Data Mining and Clustering.
SIGMOD Conference 2000: 82-92
- Dimitrios Gunopulos, George Kollios, Vassilis J. Tsotras, Carlotta Domeniconi:
Approximating Multi-Dimensional Aggregate Range Queries over Real Attributes.
SIGMOD Conference 2000: 463-474
- Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala:
Congressional Samples for Approximate Answering of Group-By Queries.
SIGMOD Conference 2000: 487-498
- Vijayshankar Raman, Bhaskaran Raman, Joseph M. Hellerstein:
Online Dynamic Reordering for Interactive Data Processing.
VLDB 1999: 709-720
- Arnd Christian König, Gerhard Weikum:
Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-size Estimation.
VLDB 1999: 423-434
- H. V. Jagadish, Olga Kapitskaia, Raymond T. Ng, Divesh Srivastava:
Multi-Dimensional Substring Selectivity Estimation.
VLDB 1999: 387-398
- Yannis E. Ioannidis, Viswanath Poosala:
Histogram-Based Approximation of Set-Valued Query-Answers.
VLDB 1999: 174-185
- Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala:
Aqua: A Fast Decision Support Systems Using Approximate Query Answers.
VLDB 1999: 754-757
- Viswanath Poosala, Venkatesh Ganti:
Fast Approximate Answers to Aggregate Queries on a Data Cube.
SSDBM 1999: 24-33
- Peter J. Haas:
Techniques for Online Exploration of Large Object-Relational Datasets.
SSDBM 1999: 4-12
- Jeffrey Scott Vitter, Min Wang:
Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets.
SIGMOD Conference 1999: 193-204
- Gurmeet Singh Manku, Sridhar Rajagopalan, Bruce G. Lindsay:
Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets.
SIGMOD Conference 1999: 251-262
- Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy:
Join Synopses for Approximate Query Answering.
SIGMOD Conference 1999: 275-286
- Michael Benedikt, Leonid Libkin:
Exact and Approximate Aggregation in Constraint Query.
PODS 1999: 102-113
- Noga Alon, Phillip B. Gibbons, Yossi Matias, Mario Szegedy:
Tracking Join and Self-Join Sizes in Limited Storage.
PODS 1999: 10-20
- Yossi Matias, Jeffrey Scott Vitter, Min Wang:
Wavelet-Based Histograms for Selectivity Estimation.
SIGMOD Conference 1998: 448-459
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:44 2009