Practical Selectivity Estimation through Adaptive Sampling.
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider:
Practical Selectivity Estimation through Adaptive Sampling.
SIGMOD Conference 1990: 1-11@inproceedings{DBLP:conf/sigmod/LiptonNS90,
author = {Richard J. Lipton and
Jeffrey F. Naughton and
Donovan A. Schneider},
editor = {Hector Garcia-Molina and
H. V. Jagadish},
title = {Practical Selectivity Estimation through Adaptive Sampling},
booktitle = {Proceedings of the 1990 ACM SIGMOD International Conference on
Management of Data, Atlantic City, NJ, May 23-25, 1990},
publisher = {ACM Press},
year = {1990},
pages = {1-11},
ee = {http://doi.acm.org/10.1145/93597.93611, db/conf/sigmod/LiptonNS90.html},
crossref = {DBLP:conf/sigmod/90},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
Recently we have proposed an adaptive, random sampling
algorithm for general query size estimation. In
earlier work we analyzed the asymptotic efficiency
and accuracy of the algorithm, in this paper we investigate
its practicality as applied to selects and joins.
First, we extend our previous analysis to provide significantly
improved bounds on the amount of sampling
necessary for a given level of accuracy. Next,
we provide "sanity bounds" to deal with queries for
which the underlying data is extremely skewed or the
query result is very small. Finally, we report on the
performance of the estimation algorithm as implemented
in a host language on a commercial relational
system. The results are encouraging, even with this
loose coupling between the estimation algorithm and the DBMS.
Copyright © 1990 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Online Version (ACM WWW Account required): Full Text in PDF Format
CDROM Version: Load the CDROM "Volume 1 Issue 2, SIGMOD '75-'92" and ...
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
BibTeX
Printed Edition
Hector Garcia-Molina, H. V. Jagadish (Eds.):
Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, Atlantic City, NJ, May 23-25, 1990.
ACM Press 1990 BibTeX
,
SIGMOD Record 19(2), June 1990
Contents
References
- [BDT83]
- Dina Bitton, David J. DeWitt, Carolyn Turbyfill:
Benchmarking Database Systems A Systematic Approach.
VLDB 1983: 8-19 BibTeX
- [Chr83a]
- Stavros Christodoulakis:
Estimating Block Transfers and Join Sizes.
SIGMOD Conference 1983: 40-54 BibTeX
- [Chr83b]
- Stavros Christodoulakis:
Estimating Block Selectivities.
Inf. Syst. 9(1): 69-79,(1984) BibTeX
- [Dem80]
- Robert Demolombe:
Estimation of the Number of Tuples Satisfying a Query Expressed in Predicate Calculus Language.
VLDB 1980: 55-63 BibTeX
- [Fed84]
- Jane Fedorowicz:
Database Evaluation Using Multiple Regression Techniques.
SIGMOD Conference 1984: 70-76 BibTeX
- [Fel68]
- ...
- [HOT88]
- Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja:
Statistical Estimators for Relational Algebra Expressions.
PODS 1988: 276-287 BibTeX
- [HOT89]
- Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja:
Processing Aggregate Relational Queries with Hard Time Constraints.
SIGMOD Conference 1989: 68-77 BibTeX
- [HTY82]
- Larry Kerschberg, Peter D. Ting, S. Bing Yao:
Query Optimization in Star Computer Networks.
ACM Trans. Database Syst. 7(4): 678-711(1982) BibTeX
- [KK85]
- Nabil Kamel, Roger King:
A Model of Data Distribution Based on Texture Analysis.
SIGMOD Conference 1985: 319-325 BibTeX
- [Koo80]
- Robert Kooi:
The Optimization of Queries in Relational Databases.
Ph.D. thesis, Case Western Reserve University 1980
BibTeX
- [LN89]
- Richard J. Lipton, Jeffrey F. Naughton:
Estimating the Size of Generalized Transitive Closures.
VLDB 1989: 165-171 BibTeX
- [LN90]
- Richard J. Lipton, Jeffrey F. Naughton:
Query Size Estimation by Adaptive Sampling.
PODS 1990: 40-46 BibTeX
- [LNS90]
- ...
- [Lyn88]
- Clifford A. Lynch:
Selectivity Estimation and Query Optimization in Large Databases with Highly Skewed Distribution of Column Values.
VLDB 1988: 240-251 BibTeX
- [MCS88]
- Michael V. Mannino, Paicheng Chu, Thomas Sager:
Statistical Profile Estimation in Database Systems.
ACM Comput. Surv. 20(3): 191-221(1988) BibTeX
- [MD88]
- M. Muralikrishna, David J. DeWitt:
Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries.
SIGMOD Conference 1988: 28-36 BibTeX
- [MDL83]
- Anthony Y. Montgomery, Daryl J. D'Souza, S. B. Lee:
The Cost of Relational Algebraic Operations on Skewed Data: Estimates and Experiments.
IFIP Congress 1983: 235-241 BibTeX
- [MK85]
- ...
- [OR86]
- Frank Olken, Doron Rotem:
Simple Random Sampling from Relational Databases.
VLDB 1986: 160-169 BibTeX
- [OR89]
- Frank Olken, Doron Rotem:
Random Sampling from B+ Trees.
VLDB 1989: 269-277 BibTeX
- [PSC84]
- Gregory Piatetsky-Shapiro, Charles Connell:
Accurate Estimation of the Number of Tuples Satisfying a Condition.
SIGMOD Conference 1984: 256-276 BibTeX
- [SAC+79]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34 BibTeX
Referenced by
- Kaushik Chakrabarti, Minos N. Garofalakis, Rajeev Rastogi, Kyuseok Shim:
Approximate Query Processing Using Wavelets.
VLDB 2000: 111-122
- Dimitrios Gunopulos, George Kollios, Vassilis J. Tsotras, Carlotta Domeniconi:
Approximating Multi-Dimensional Aggregate Range Queries over Real Attributes.
SIGMOD Conference 2000: 463-474
- Yannis E. Ioannidis, Viswanath Poosala:
Histogram-Based Approximation of Set-Valued Query-Answers.
VLDB 1999: 174-185
- Swarup Acharya, Viswanath Poosala, Sridhar Ramaswamy:
Selectivity Estimation in Spatial Databases.
SIGMOD Conference 1999: 13-24
- Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy:
Join Synopses for Approximate Query Answering.
SIGMOD Conference 1999: 275-286
- Ashraf Aboulnaga, Surajit Chaudhuri:
Self-tuning Histograms: Building Histograms Without Looking at Data.
SIGMOD Conference 1999: 181-192
- Noga Alon, Phillip B. Gibbons, Yossi Matias, Mario Szegedy:
Tracking Join and Self-Join Sizes in Limited Storage.
PODS 1999: 10-20
- H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Viswanath Poosala, Kenneth C. Sevcik, Torsten Suel:
Optimal Histograms with Quality Guarantees.
VLDB 1998: 275-286
- Michael J. Carey, Donald Kossmann:
Reducing the Braking Distance of an SQL Query Engine.
VLDB 1998: 158-169
- Yossi Matias, Jeffrey Scott Vitter, Min Wang:
Wavelet-Based Histograms for Selectivity Estimation.
SIGMOD Conference 1998: 448-459
- Surajit Chaudhuri, Rajeev Motwani, Vivek R. Narasayya:
Random Sampling for Histogram Construction: How much is enough?
SIGMOD Conference 1998: 436-447
- Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik:
The New Jersey Data Reduction Report.
IEEE Data Eng. Bull. 20(4): 3-45(1997)
- Viswanath Poosala, Yannis E. Ioannidis:
Selectivity Estimation Without the Attribute Value Independence Assumption.
VLDB 1997: 486-495
- Banchong Harangsri, John Shepherd, Anne H. H. Ngu:
Query Size Estimation Using Machine Learning.
DASFAA 1997: 97-106
- Evan P. Harris, Kotagiri Ramamohanarao:
Join Algorithm Costs Revisited.
VLDB J. 5(1): 64-84(1996)
- Gennady Antoshenkov, Mohamed Ziauddin:
Query Processing and Optimization in Oracle Rdb.
VLDB J. 5(4): 229-237(1996)
- Laks V. S. Lakshmanan, Fereidoon Sadri, Iyer N. Subramanian:
SchemaSQL - A Language for Interoperability in Relational Multi-Database Systems.
VLDB 1996: 239-250
- Nabil I. Hachem, Chenye Bao, Steve Taylor:
Approximate Query Answering In Numerical Databases.
SSDBM 1996: 63-73
- Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita:
Improved Histograms for Selectivity Estimation of Range Predicates.
SIGMOD Conference 1996: 294-305
- P. Krishnan, Jeffrey Scott Vitter, Balakrishna R. Iyer:
Estimating Alphanumeric Selectivity in the Presence of Wildcards.
SIGMOD Conference 1996: 282-293
- Sumit Ganguly, Phillip B. Gibbons, Yossi Matias, Abraham Silberschatz:
Bifocal Sampling for Skew-Resistant Join Size Estimation.
SIGMOD Conference 1996: 271-281
- Nick Roussopoulos, Chung-Min Chen, Stephen Kelley, Alex Delis, Yannis Papakonstantinou:
The ADMS Project: View R Us.
IEEE Data Eng. Bull. 18(2): 19-28(1995)
- Yannis E. Ioannidis, Viswanath Poosala:
Balancing Histogram Optimality and Practicality for Query Result Size Estimation.
SIGMOD Conference 1995: 233-244
- Yibei Ling, Wei Sun:
An Evaluation of Sampling-Based Size Estimation Methods for Selections in Database Systems.
ICDE 1995: 532-539
- Chengwen Liu, I-Ping Chu:
Load Balancing in Distributed Query Processing.
DASFAA 1995: 256-263
- Augustine C. Ikeji, Farshad Fotouhi:
Computation of Partial Query Results Using An Adaptive Stratified Sampling Technique.
CIKM 1995: 145-149
- Jason Tsong-Li Wang, Gung-Wei Chirn, Thomas G. Marr, Bruce A. Shapiro, Dennis Shasha, Kaizhong Zhang:
Combinatorial Pattern Discovery for Scientific Data: Some Preliminary Results.
SIGMOD Conference 1994: 115-125
- Chung-Min Chen, Nick Roussopoulos:
Adaptive Selectivity Estimation Using Query Feedback.
SIGMOD Conference 1994: 161-172
- Jyrki Kivinen, Heikki Mannila:
The Power of Sampling in Knowledge Discovery.
PODS 1994: 77-85
- Peter J. Haas, Jeffrey F. Naughton, Arun N. Swami:
On the Relative Cost of Sampling for Join Selectivity Estimation.
PODS 1994: 14-24
- Qiang Zhu, Per-Åke Larson:
A Query Sampling Method of Estimating Local Cost Parameters in a Multidatabase System.
ICDE 1994: 144-153
- Gennady Antoshenkov:
Query Processing in DEC Rdb: Major Issues and Future Challenges.
IEEE Data Eng. Bull. 16(4): 42-52(1993)
- Wei Sun, Yibei Ling, Naphtali Rishe, Yi Deng:
An Instant and Accurate Estimation Method for Joins and Selection in a Retrieval-Intensive Environment.
SIGMOD Conference 1993: 79-88
- Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami:
Fixed-Precision Estimation of Join Selectivity.
PODS 1993: 190-201
- Allen Van Gelder:
Multiple Join Size Estimation by Virtual Domains.
PODS 1993: 180-189
- Priti Mishra, Margaret H. Eich:
Join Processing in Relational Databases.
ACM Comput. Surv. 24(1): 63-113(1992)
- Alfons Kemper, Guido Moerkotte, Michael Steinbrunn:
Optimizing Boolean Expressions in Object-Bases.
VLDB 1992: 79-90
- Gennady Antoshenkov:
Random Sampling from Pseudo-Ranked B+ Trees.
VLDB 1992: 375-382
- Peter J. Haas, Arun N. Swami:
Sequential Sampling Procedures for Query Size Estimation.
SIGMOD Conference 1992: 341-350
- Jyrki Kivinen, Heikki Mannila:
Approximate Dependency Inference from Relations.
ICDT 1992: 86-98
- Frank Olken, Doron Rotem:
Maintenance of Materialized Views of Sampling Queries.
ICDE 1992: 632-641
- S. Seshadri, Jeffrey F. Naughton:
Sampling Issues in Parallel Database Systems.
EDBT 1992: 328-343
- Wen-Chi Hou, Gultekin Özsoyoglu, Erdogan Dogdu:
Error-Constraint COUNT Query Evaluation in Relational Databases.
SIGMOD Conference 1991: 278-287
- Frédéric Andrès, Michel Couprie, Yann Viémont:
A Multi-Environment Cost Evaluator for Parallel Database Systems.
DASFAA 1991: 126-135
- Donovan A. Schneider, David J. DeWitt:
Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines.
VLDB 1990: 469-480
- Jeffrey F. Naughton, S. Seshadri:
On Estimating the Size of Projections.
ICDT 1990: 499-513
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:00 2009