ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Statistical Estimators for Relational Algebra Expressions.

Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Statistical Estimators for Relational Algebra Expressions. PODS 1988: 276-287
@inproceedings{DBLP:conf/pods/HouOT88,
  author    = {Wen-Chi Hou and
               Gultekin {\"O}zsoyoglu and
               Baldeo K. Taneja},
  title     = {Statistical Estimators for Relational Algebra Expressions},
  booktitle = {Proceedings of the Seventh ACM SIGACT-SIGMOD-SIGART Symposium
               on Principles of Database Systems, March 21-23, 1988, Austin,
               Texas},
  publisher = {ACM},
  year      = {1988},
  isbn      = {0-89791-263-2},
  pages     = {276-287},
  ee        = {http://doi.acm.org/10.1145/308386.308455, db/conf/pods/HouOT88.html},
  crossref  = {DBLP:conf/pods/88},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

Present database systems process all the data related to a query before giving out responses. As a result, the size of the data to be processed becomes excessive for realtime/time-constrained environments. A new methodology is needed to cut down systematically the time to process the data involved in processing the query. To this end, we propose to use data samples and construct an approximate synthetic response to a given query.

In this paper, we consider only COUNT(E) type queries, where E is an arbitrary relational algebra expression. We make no assumptions about the distribution of attribute values and ordering of tuples in the input relations, and propose consistent and unbiased estimators for arbitrary COUNT(E) type queries. We design a sampling plan based on the cluster sampling method to improve the utilization of sampled data and to reduce the cost of sampling. We also evaluate the performance of the proposed estimators.

Copyright © 1988 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ... BibTeX

Printed Edition

Proceedings of the Seventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, March 21-23, 1988, Austin, Texas. ACM 1988, ISBN 0-89791-263-2
Contents BibTeX

Online Edition: ACM Digital Library


References

[Coch 77]
...
[Chri 83]
Stavros Christodoulakis: Estimating record selectivities. Inf. Syst. 8(2): 105-115(1983) BibTeX
[Devo 84]
...
[DFHO 86]
...
[Good 49]
...
[HoOT 87]
...
[Liu 68]
...
[Morg 81]
...
[Olke 86]
...
[OlkR 86]
Frank Olken, Doron Rotem: Simple Random Sampling from Relational Databases. VLDB 1986: 160-169 BibTeX
[Ross 80]
...
[Rowe 85]
Neil C. Rowe: Antisampling for Estimation: An Overview. IEEE Trans. Software Eng. 11(10): 1081-1091(1985) BibTeX
[SMO 86]
...
[Sukh 84]
...

Referenced by

  1. David Gross, Michel de Rougemont: Uniform Generation in Spatial Constraint Databases and Applications. PODS 2000: 254-259
  2. Moses Charikar, Surajit Chaudhuri, Rajeev Motwani, Vivek R. Narasayya: Towards Estimation Error Guarantees for Distinct Values. PODS 2000: 268-279
  3. Kian-Lee Tan, Cheng Hian Goh, Beng Chin Ooi: Online Feedback for Nested Aggregate Queries with Multi-Threading. VLDB 1999: 18-29
  4. Peter J. Haas, Joseph M. Hellerstein: Ripple Joins for Online Aggregation. SIGMOD Conference 1999: 287-298
  5. Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy: Join Synopses for Approximate Query Answering. SIGMOD Conference 1999: 275-286
  6. Noga Alon, Phillip B. Gibbons, Yossi Matias, Mario Szegedy: Tracking Join and Self-Join Sizes in Limited Storage. PODS 1999: 10-20
  7. Joseph M. Hellerstein: Optimization Techniques for Queries with Expensive Methods. ACM Trans. Database Syst. 23(2): 113-157(1998)
  8. Surajit Chaudhuri, Rajeev Motwani, Vivek R. Narasayya: Random Sampling for Histogram Construction: How much is enough? SIGMOD Conference 1998: 436-447
  9. Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik: The New Jersey Data Reduction Report. IEEE Data Eng. Bull. 20(4): 3-45(1997)
  10. Joseph M. Hellerstein, Peter J. Haas, Helen J. Wang: Online Aggregation. SIGMOD Conference 1997: 171-182
  11. Banchong Harangsri, John Shepherd, Anne H. H. Ngu: Query Size Estimation Using Machine Learning. DASFAA 1997: 97-106
  12. Nabil I. Hachem, Chenye Bao, Steve Taylor: Approximate Query Answering In Numerical Databases. SSDBM 1996: 63-73
  13. Joseph M. Hellerstein, Jeffrey F. Naughton: Query Execution Techniques for Caching Expensive Methods. SIGMOD Conference 1996: 423-434
  14. Sumit Ganguly, Phillip B. Gibbons, Yossi Matias, Abraham Silberschatz: Bifocal Sampling for Skew-Resistant Join Size Estimation. SIGMOD Conference 1996: 271-281
  15. Gultekin Özsoyoglu, Sujatha Guru Swamy, Kaizheng Du, Wen-Chi Hou: Time-Constrained Query Processing in CASE-DB. IEEE Trans. Knowl. Data Eng. 7(6): 865-884(1995)
  16. Nick Roussopoulos, Chung-Min Chen, Stephen Kelley, Alex Delis, Yannis Papakonstantinou: The ADMS Project: View R Us. IEEE Data Eng. Bull. 18(2): 19-28(1995)
  17. Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Lynne Stokes: Sampling-Based Estimation of the Number of Distinct Values of an Attribute. VLDB 1995: 311-322
  18. Chengwen Liu, I-Ping Chu: Load Balancing in Distributed Query Processing. DASFAA 1995: 256-263
  19. Chung-Min Chen, Nick Roussopoulos: Adaptive Selectivity Estimation Using Query Feedback. SIGMOD Conference 1994: 161-172
  20. Peter J. Haas, Jeffrey F. Naughton, Arun N. Swami: On the Relative Cost of Sampling for Join Selectivity Estimation. PODS 1994: 14-24
  21. Wen-Chi Hou, Gultekin Özsoyoglu: Processing Time-Constrained Aggregate Queries in CASE-DB. ACM Trans. Database Syst. 18(2): 224-261(1993)
  22. Wei Sun, Yibei Ling, Naphtali Rishe, Yi Deng: An Instant and Accurate Estimation Method for Joins and Selection in a Retrieval-Intensive Environment. SIGMOD Conference 1993: 79-88
  23. Joseph M. Hellerstein, Michael Stonebraker: Predicate Migration: Optimizing Queries with Expensive Predicates. SIGMOD Conference 1993: 267-276
  24. Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami: Fixed-Precision Estimation of Join Selectivity. PODS 1993: 190-201
  25. Allen Van Gelder: Multiple Join Size Estimation by Virtual Domains. PODS 1993: 180-189
  26. Frank Olken, Doron Rotem: Sampling from Spatial Databases. ICDE 1993: 199-208
  27. Gennady Antoshenkov: Random Sampling from Pseudo-Ranked B+ Trees. VLDB 1992: 375-382
  28. Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350
  29. Gultekin Özsoyoglu, Kaizheng Du, Sujatha Guru Swamy, Wen-Chi Hou: Processing Real-Time, Non-Aggregate Queries with Time-Constraints in CASE-DB. ICDE 1992: 410-417
  30. Frank Olken, Doron Rotem: Maintenance of Materialized Views of Sampling Queries. ICDE 1992: 632-641
  31. S. Seshadri, Jeffrey F. Naughton: Sampling Issues in Parallel Database Systems. EDBT 1992: 328-343
  32. Wen-Chi Hou, Gultekin Özsoyoglu: Statistical Estimators for Aggregate Relational Algebra Queries. ACM Trans. Database Syst. 16(4): 600-654(1991)
  33. David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider: An Evaluation of Non-Equijoin Algorithms. VLDB 1991: 443-452
  34. Wen-Chi Hou, Gultekin Özsoyoglu, Erdogan Dogdu: Error-Constraint COUNT Query Evaluation in Relational Databases. SIGMOD Conference 1991: 278-287
  35. Gultekin Özsoyoglu, Wen-Chi Hou, Adegbemiga Ola: Database Systems for Programmable Logic Controllers. SSDBM 1990: 183-199
  36. Frank Olken, Doron Rotem: Random Sampling from Database Files: A Survey. SSDBM 1990: 92-111
  37. Frank Olken, Doron Rotem, Ping Xu: Random Sampling from Hash Files. SIGMOD Conference 1990: 375-386
  38. Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11
  39. Richard J. Lipton, Jeffrey F. Naughton: Query Size Estimation by Adaptive Sampling. PODS 1990: 40-46
  40. Jeffrey F. Naughton, S. Seshadri: On Estimating the Size of Projections. ICDT 1990: 499-513
  41. Saumya K. Debray, Nai-Wei Lin: Static Estimation of Query Sizes in Horn Programs. ICDT 1990: 514-528
  42. Frank Olken, Doron Rotem: Random Sampling from B+ Trees. VLDB 1989: 269-277
  43. Richard J. Lipton, Jeffrey F. Naughton: Estimating the Size of Generalized Transitive Closures. VLDB 1989: 165-171
  44. Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Processing Aggregate Relational Queries with Hard Time Constraints. SIGMOD Conference 1989: 68-77
  45. Jaideep Srivastava, Doron Rotem: Precision-Time Tradeoffs: A Paradigm for Processing Statistical Queries on Databases. SSDBM 1988: 226-245
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:33:55 2009