ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Fixed-Precision Estimation of Join Selectivity.

Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami: Fixed-Precision Estimation of Join Selectivity. PODS 1993: 190-201
@inproceedings{DBLP:conf/pods/HaasNSS93,
  author    = {Peter J. Haas and
               Jeffrey F. Naughton and
               S. Seshadri and
               Arun N. Swami},
  title     = {Fixed-Precision Estimation of Join Selectivity},
  booktitle = {Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium
               on Principles of Database Systems, May 25-28, 1993, Washington,
               DC},
  publisher = {ACM Press},
  year      = {1993},
  isbn      = {0-89791-593-3},
  pages     = {190-201},
  ee        = {http://doi.acm.org/10.1145/153850.153875, db/conf/pods/HaasNSS93.html},
  crossref  = {DBLP:conf/pods/93},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We compare the performance of sampling-based procedures for estimation of the selectivity of an equijoin. While some of the procedures have been proposed in the database sampling literature, their relative performance has never been analyzed. A main result of this paper is a partial ordering that compares the variability of the estimators for the different procedures after an arbitrary fixed number of sampling steps. Prior to the current work, it was also unknown whether these fixed-step estimation procedures can be extended to asymptotically efficient fixed-precision estimation procedures. Our second main result is a general method for such an extension and a proof that the method is valid for all the estimation procedures under consideration. Finally, we show that, under reasonable assumptions on sampling costs, the partial ordering on the variability of the fixed-step estimation procedures implies a partial ordering on the cost of the corresponding fixed-precision estimation procedures. These results lead to a new algorithm for fixed-precision estimation of the selectivity of an equijoin. The algorithm appears to be the best available when there are no indices on the join key. Our results can be extended to general select-join queries.

Copyright © 1993 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ... BibTeX

Printed Edition

Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 25-28, 1993, Washington, DC. ACM Press 1993, ISBN 0-89791-593-3
Contents BibTeX

Online Edition: ACM Digital Library

[Abstract and Index Terms]
[Full Text in PDF Format, 1119 KB]

Journal Version

Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami: Selectivity and Cost Estimation for Joins Based on Random Sampling. J. Comput. Syst. Sci. 52(3): 550-569(1996) BibTeX

References

[1]
...
[2]
...
[3]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40 BibTeX
[4]
...
[5]
Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350 BibTeX
[6]
...
[7]
...
[8]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Statistical Estimators for Relational Algebra Expressions. PODS 1988: 276-287 BibTeX
[9]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Processing Aggregate Relational Queries with Hard Time Constraints. SIGMOD Conference 1989: 68-77 BibTeX
[10]
Wen-Chi Hou, Gultekin Özsoyoglu, Erdogan Dogdu: Error-Constraint COUNT Query Evaluation in Relational Databases. SIGMOD Conference 1991: 278-287 BibTeX
[11]
Richard J. Lipton, Jeffrey F. Naughton: Estimating the Size of Generalized Transitive Closures. VLDB 1989: 165-171 BibTeX
[12]
Richard J. Lipton, Jeffrey F. Naughton: Query Size Estimation by Adaptive Sampling. PODS 1990: 40-46 BibTeX
[13]
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 BibTeX
[14]
...
[15]
...

Referenced by

  1. Noga Alon, Phillip B. Gibbons, Yossi Matias, Mario Szegedy: Tracking Join and Self-Join Sizes in Limited Storage. PODS 1999: 10-20
  2. Sumit Ganguly, Phillip B. Gibbons, Yossi Matias, Abraham Silberschatz: Bifocal Sampling for Skew-Resistant Join Size Estimation. SIGMOD Conference 1996: 271-281
  3. Peter J. Haas, Arun N. Swami: Sampling-Based Selectivity Estimation for Joins Using Augmented Frequent Value Statistics. ICDE 1995: 522-531
  4. Arie Segev, Abhirup Chatterjee: Supporting Statistical Operations in Extensible Databases: A Case Study. SSDBM 1994: 54-63
  5. Peter J. Haas, Jeffrey F. Naughton, Arun N. Swami: On the Relative Cost of Sampling for Join Selectivity Estimation. PODS 1994: 14-24
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:34:08 2009