ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

An Instant and Accurate Estimation Method for Joins and Selection in a Retrieval-Intensive Environment.

Wei Sun, Yibei Ling, Naphtali Rishe, Yi Deng: An Instant and Accurate Estimation Method for Joins and Selection in a Retrieval-Intensive Environment. SIGMOD Conference 1993: 79-88
@inproceedings{DBLP:conf/sigmod/SuLRD93,
  author    = {Wei Sun and
               Yibei Ling and
               Naphtali Rishe and
               Yi Deng},
  editor    = {Peter Buneman and
               Sushil Jajodia},
  title     = {An Instant and Accurate Estimation Method for Joins and Selection
               in a Retrieval-Intensive Environment},
  booktitle = {Proceedings of the 1993 ACM SIGMOD International Conference on
               Management of Data, Washington, D.C., May 26-28, 1993},
  publisher = {ACM Press},
  year      = {1993},
  pages     = {79-88},
  ee        = {http://doi.acm.org/10.1145/170035.170055, db/conf/sigmod/SuLRD93.html},
  crossref  = {DBLP:conf/sigmod/93},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

This paper proposes a novel strategy for estimating the size of the resulting relation after an equi-join and selection using a regression model. An approximating series representing the underlying data distribution and dependency is derived from the actual data. The proposed method provides an instant and accurate size estimation by performing an evaluation of the series, with no run-time overheads in page faults and space, and with negligible CPU overhead. In contrast, the popular sampling methods incur run-time overheads in page faults (for sampling), CPU time and space. These overheads of sampling methods increase the response time of processing a query. The results of a comprehensive experimental study are also reported, which demonstrate that the estimation accuracy by the proposed method is comparable with that of the sampling methods which are believed to provide the most accurate estimation. The proposed method seems ideal for retrieval-intensive database and information systems. Since the overheads involved in deriving the approximating series are fairly moderate, we believe that this method is also an extremely competent method when moderate or periodical updates are present.

Copyright © 1993 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

Online Version (ACM WWW Account required): Full Text in PDF Format

CDROM Version: Load the CDROM "Volume 1 Issue 1, SIGMOD '93-'97" and ...

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Peter Buneman, Sushil Jajodia (Eds.): Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, D.C., May 26-28, 1993. ACM Press 1993 BibTeX , SIGMOD Record 22(2), June 1993
Contents

Online Edition: ACM Digital Library

[Index Terms]
[Full Text in PDF Format, 1026 KB]

References

[1]
...
[2]
Stavros Christodoulakis: Estimating record selectivities. Inf. Syst. 8(2): 105-115(1983) BibTeX
[3]
Stavros Christodoulakis: Estimating Block Transfers and Join Sizes. SIGMOD Conference 1983: 40-54 BibTeX
[4]
Stavros Christodoulakis: Implications of Certain Assumptions in Database Performance Evaluation. ACM Trans. Database Syst. 9(2): 163-186(1984) BibTeX
[5]
...
[6]
Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350 BibTeX
[7]
Wen-Chi Hou, Gultekin Özsoyoglu, Erdogan Dogdu: Error-Constraint COUNT Query Evaluation in Relational Databases. SIGMOD Conference 1991: 278-287 BibTeX
[8]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Statistical Estimators for Relational Algebra Expressions. PODS 1988: 276-287 BibTeX
[9]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Processing Aggregate Relational Queries with Hard Time Constraints. SIGMOD Conference 1989: 68-77 BibTeX
[10]
Wen-Chi Hou, Gultekin Özsoyoglu: Statistical Estimators for Aggregate Relational Algebra Queries. ACM Trans. Database Syst. 16(4): 600-654(1991) BibTeX
[11]
Matthias Jarke, Jürgen Koch: Query Optimization in Database Systems. ACM Comput. Surv. 16(2): 111-152(1984) BibTeX
[12]
...
[13]
...
[14]
Ezio Lefons, Alberto Silvestri, Filippo Tangorra: An Analytic Approach to Statistical Databases. VLDB 1983: 260-274 BibTeX
[15]
...
[16]
Yibei Ling, Wei Sun: A Suppletment to Sampling-Based Methods for Query Size Estimation in a Database System. SIGMOD Record 21(4): 12-15(1992) BibTeX
[17]
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 BibTeX
[18]
...
[19]
Clifford A. Lynch: Selectivity Estimation and Query Optimization in Large Databases with Highly Skewed Distribution of Column Values. VLDB 1988: 240-251 BibTeX
[20]
...
[21]
Michael V. Mannino, Paicheng Chu, Thomas Sager: Statistical Profile Estimation in Database Systems. ACM Comput. Surv. 20(3): 191-221(1988) BibTeX
[22]
...
[23]
Tommaso Mostardi: Estimating the size of relational SP-Theta-J operation results: an analytical approach. Inf. Syst. 15(5): 591-601(1990) BibTeX
[24]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34 BibTeX
[25]
...
[26]
...
[27]
S. Bing Yao: Approximating the Number of Accesses in Database Organizations. Commun. ACM 20(4): 260-261(1977) BibTeX
[28]
Clement T. Yu, C. C. Chang: Distributed Query Processing. ACM Comput. Surv. 16(4): 399-433(1984) BibTeX

Referenced by

  1. Arnd Christian König, Gerhard Weikum: Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-size Estimation. VLDB 1999: 423-434
  2. Ju-Hong Lee, Deok-Hwan Kim, Chin-Wan Chung: Multi-dimensional Selectivity Estimation Using Compressed Histogram Information. SIGMOD Conference 1999: 205-214
  3. M. Seetha Lakshmi, Shaoyu Zhou: Selectivity Estimation in Extensible Databases - A Neural Network Approach. VLDB 1998: 623-627
  4. Elisa Bertino, Paola Foscoli: On Modeling Cost Functions for Object-Oriented Databases. IEEE Trans. Knowl. Data Eng. 9(3): 500-508(1997)
  5. Viswanath Poosala, Yannis E. Ioannidis: Selectivity Estimation Without the Attribute Value Independence Assumption. VLDB 1997: 486-495
  6. Banchong Harangsri, John Shepherd, Anne H. H. Ngu: Query Size Estimation Using Machine Learning. DASFAA 1997: 97-106
  7. Evan P. Harris, Kotagiri Ramamohanarao: Join Algorithm Costs Revisited. VLDB J. 5(1): 64-84(1996)
  8. Nabil I. Hachem, Chenye Bao, Steve Taylor: Approximate Query Answering In Numerical Databases. SSDBM 1996: 63-73
  9. Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita: Improved Histograms for Selectivity Estimation of Range Predicates. SIGMOD Conference 1996: 294-305
  10. Nick Roussopoulos, Chung-Min Chen, Stephen Kelley, Alex Delis, Yannis Papakonstantinou: The ADMS Project: View R Us. IEEE Data Eng. Bull. 18(2): 19-28(1995)
  11. Yibei Ling, Wei Sun: An Evaluation of Sampling-Based Size Estimation Methods for Selections in Database Systems. ICDE 1995: 532-539
  12. Serge Abiteboul, Richard Hull, Victor Vianu: Foundations of Databases. Addison-Wesley 1995, ISBN 0-201-53771-0
    Contents
  13. Chung-Min Chen, Nick Roussopoulos: Adaptive Selectivity Estimation Using Query Feedback. SIGMOD Conference 1994: 161-172
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:13 2009