ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

On the Relative Cost of Sampling for Join Selectivity Estimation.

Peter J. Haas, Jeffrey F. Naughton, Arun N. Swami: On the Relative Cost of Sampling for Join Selectivity Estimation. PODS 1994: 14-24
@inproceedings{DBLP:conf/pods/HaasNS94,
  author    = {Peter J. Haas and
               Jeffrey F. Naughton and
               Arun N. Swami},
  title     = {On the Relative Cost of Sampling for Join Selectivity Estimation},
  booktitle = {Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium
               on Principles of Database Systems, May 24-26, 1994, Minneapolis,
               Minnesota},
  publisher = {ACM Press},
  year      = {1994},
  isbn      = {0-89791-642-5},
  pages     = {14-24},
  ee        = {http://doi.acm.org/10.1145/182591.182594, db/conf/pods/pods94-14.html},
  crossref  = {DBLP:conf/pods/94},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We compare the cost of estimating the selectivity of a "star join" using sampling procedure t_cross to the cost of simply computing the join and obtaining the exact answer. Our bounds and approximations for the relative cost of sampling show how this cost depends on the size of the input relations, the number of input relations, and the precision criterion used by the estimation procedure. We also demonstrate the deleterious effect of dangling tuples and the mixed effect of data skew on the relative cost of sampling. These results provide insight into when sampling should or should not be used for join selectivity estimation.

Copyright © 1994 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ... BibTeX

Printed Edition

Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 24-26, 1994, Minneapolis, Minnesota. ACM Press 1994, ISBN 0-89791-642-5
Contents BibTeX

Online Edition: ACM Digital Library

[Abstract and Index Terms]
[Full Text in PDF Format, 924 KB]

References

[DNS92]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40 BibTeX
[HN+93a]
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami: Fixed-Precision Estimation of Join Selectivity. PODS 1993: 190-201 BibTeX
[HN+93b]
...
[HS92a]
Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350 BibTeX
[HS92b]
...
[HOT88]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Statistical Estimators for Relational Algebra Expressions. PODS 1988: 276-287 BibTeX
[HOT89]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Processing Aggregate Relational Queries with Hard Time Constraints. SIGMOD Conference 1989: 68-77 BibTeX
[HOD91]
Wen-Chi Hou, Gultekin Özsoyoglu, Erdogan Dogdu: Error-Constraint COUNT Query Evaluation in Relational Databases. SIGMOD Conference 1991: 278-287 BibTeX
[LN89]
Richard J. Lipton, Jeffrey F. Naughton: Estimating the Size of Generalized Transitive Closures. VLDB 1989: 165-171 BibTeX
[LN90]
Richard J. Lipton, Jeffrey F. Naughton: Query Size Estimation by Adaptive Sampling. PODS 1990: 40-46 BibTeX
[LNS90]
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 BibTeX
[OR86]
Frank Olken, Doron Rotem: Simple Random Sampling from Relational Databases. VLDB 1986: 160-169 BibTeX
[OR89]
Frank Olken, Doron Rotem: Random Sampling from B+ Trees. VLDB 1989: 269-277 BibTeX
[ORX90]
Frank Olken, Doron Rotem, Ping Xu: Random Sampling from Hash Files. SIGMOD Conference 1990: 375-386 BibTeX
[Olk93]
...
[Ses92]
...

Referenced by

  1. Surajit Chaudhuri, Rajeev Motwani: On Sampling and Relational Operators. IEEE Data Eng. Bull. 22(4): 41-46(1999)
  2. Peter J. Haas, Joseph M. Hellerstein: Ripple Joins for Online Aggregation. SIGMOD Conference 1999: 287-298
  3. Surajit Chaudhuri, Rajeev Motwani, Vivek R. Narasayya: On Random Sampling over Joins. SIGMOD Conference 1999: 263-274
  4. Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy: Join Synopses for Approximate Query Answering. SIGMOD Conference 1999: 275-286
  5. Sumit Ganguly, Phillip B. Gibbons, Yossi Matias, Abraham Silberschatz: Bifocal Sampling for Skew-Resistant Join Size Estimation. SIGMOD Conference 1996: 271-281
  6. Peter J. Haas, Arun N. Swami: Sampling-Based Selectivity Estimation for Joins Using Augmented Frequent Value Statistics. ICDE 1995: 522-531
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:34:09 2009