Balancing Histogram Optimality and Practicality for Query Result Size Estimation.
Yannis E. Ioannidis, Viswanath Poosala:
Balancing Histogram Optimality and Practicality for Query Result Size Estimation.
SIGMOD Conference 1995: 233-244@inproceedings{DBLP:conf/sigmod/IoannidisP95,
author = {Yannis E. Ioannidis and
Viswanath Poosala},
editor = {Michael J. Carey and
Donovan A. Schneider},
title = {Balancing Histogram Optimality and Practicality for Query Result
Size Estimation},
booktitle = {Proceedings of the 1995 ACM SIGMOD International Conference on
Management of Data, San Jose, California, May 22-25, 1995},
publisher = {ACM Press},
year = {1995},
pages = {233-244},
ee = {http://doi.acm.org/10.1145/223784.223841, db/conf/sigmod/sigmod95-18.html},
crossref = {DBLP:conf/sigmod/95},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
Many current database systems use histograms to approximate the frequency
distribution of values in the attributes of relations and based on them
estimate query result sizes and access plan costs. In choosing among the
various histograms, one has to balance between two conflicting goals: optimality,
so that generated estimates have the least error, and practicality, so that
histograms can be constructed and maintained efficiently. In this paper,
we present both theoretical and experimental results on several issues related
to this trade-off. Our overall conclusion is that the most effective approach
is to focus on the class of histograms that accurately maintain the frequencies
of a few attribute values and assume the uniform distribution for the rest,
and choose for each relation the histogram in that class that is optimal
for a self-join query.
Copyright © 1995 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Online Version (ACM WWW Account required): Full Text in PDF Format
CDROM Version: Load the CDROM "Volume 1 Issue 1, SIGMOD '93-'97" and ...
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
BibTeX
Printed Edition
Michael J. Carey, Donovan A. Schneider (Eds.):
Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California, May 22-25, 1995.
ACM Press 1995 BibTeX
,
SIGMOD Record 24(2),
June 1995
Contents
[Index Terms]
[Full Text in PDF Format, 1342 KB]
References
- [1]
- Chung-Min Chen, Nick Roussopoulos:
Adaptive Selectivity Estimation Using Query Feedback.
SIGMOD Conference 1994: 161-172 BibTeX
- [2]
- Stavros Christodoulakis:
Estimating Block Transfers and Join Sizes.
SIGMOD Conference 1983: 40-54 BibTeX
- [3]
- Stavros Christodoulakis:
Implications of Certain Assumptions in Database Performance Evaluation.
ACM Trans. Database Syst. 9(2): 163-186(1984) BibTeX
- [4]
- ...
- [5]
- ...
- [6]
- Christos Faloutsos, H. V. Jagadish:
On B-Tree Indices for Skewed Distributions.
VLDB 1992: 363-374 BibTeX
- [7]
- Peter J. Haas, Arun N. Swami:
Sequential Sampling Procedures for Query Size Estimation.
SIGMOD Conference 1992: 341-350 BibTeX
- [8]
- Peter J. Haas, Arun N. Swami:
Sampling-Based Selectivity Estimation for Joins Using Augmented Frequent Value Statistics.
ICDE 1995: 522-531 BibTeX
- [9]
- Yannis E. Ioannidis:
Universality of Serial Histograms.
VLDB 1993: 256-267 BibTeX
- [10]
- Yannis E. Ioannidis, Stavros Christodoulakis:
On the Propagation of Errors in the Size of Join Results.
SIGMOD Conference 1991: 268-277 BibTeX
- [11]
- ...
- [12]
- ...
- [13]
- Nabil Kamel, Roger King:
A Model of Data Distribution Based on Texture Analysis.
SIGMOD Conference 1985: 319-325 BibTeX
- [14]
- ...
- [15]
- Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider:
Practical Selectivity Estimation through Adaptive Sampling.
SIGMOD Conference 1990: 1-11 BibTeX
- [16]
- Michael V. Mannino, Paicheng Chu, Thomas Sager:
Statistical Profile Estimation in Database Systems.
ACM Comput. Surv. 20(3): 191-221(1988) BibTeX
- [17]
- ...
- [18]
- T. H. Merrett, Ekow J. Otoo:
Distribution Models of Relations.
VLDB 1979: 418-425 BibTeX
- [19]
- M. Muralikrishna, David J. DeWitt:
Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries.
SIGMOD Conference 1988: 28-36 BibTeX
- [20]
- B. Muthuswamy, Larry Kerschberg:
A Detailed Database Statistics Model for Realtional Query Optimization.
ACM Annual Conference - The range of computing: mid-80's perspective 1985: 439-448 BibTeX
- [21]
- Gregory Piatetsky-Shapiro, Charles Connell:
Accurate Estimation of the Number of Tuples Satisfying a Condition.
SIGMOD Conference 1984: 256-276 BibTeX
- [22]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34 BibTeX
- [23]
- Yun Wang:
Experience from a Real Life Query Optimizer.
SIGMOD Conference 1992: 286 BibTeX
- [24]
- George Kingsley Zipf:
Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology.
Addison-Wesley 1949
BibTeX
Referenced by
- Francesco Buccafurri, Filippo Furfaro, Domenico Saccà:
Estimating Range Queries Using Aggregate Data with Integrity Constraints: A Probabilistic Approach.
ICDT 2001: 390-404
- Nick Koudas, S. Muthukrishnan, Divesh Srivastava:
Optimal Histograms for Hierarchical Range Queries.
PODS 2000: 196-204
- Viswanath Poosala, Venkatesh Ganti, Yannis E. Ioannidis:
Approximate Query Answering using Histograms.
IEEE Data Eng. Bull. 22(4): 5-14(1999)
- H. V. Jagadish, Nick Koudas, S. Muthukrishnan:
Mining Deviants in a Time Series Database.
VLDB 1999: 102-113
- Arnd Christian König, Gerhard Weikum:
Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-size Estimation.
VLDB 1999: 423-434
- H. V. Jagadish, Olga Kapitskaia, Raymond T. Ng, Divesh Srivastava:
Multi-Dimensional Substring Selectivity Estimation.
VLDB 1999: 387-398
- Viswanath Poosala, Venkatesh Ganti:
Fast Approximate Answers to Aggregate Queries on a Data Cube.
SSDBM 1999: 24-33
- Ju-Hong Lee, Deok-Hwan Kim, Chin-Wan Chung:
Multi-dimensional Selectivity Estimation Using Compressed Histogram Information.
SIGMOD Conference 1999: 205-214
- H. V. Jagadish, Raymond T. Ng, Divesh Srivastava:
Substring Selectivity Estimation.
PODS 1999: 249-260
- Noga Alon, Phillip B. Gibbons, Yossi Matias, Mario Szegedy:
Tracking Join and Self-Join Sizes in Limited Storage.
PODS 1999: 10-20
- S. Muthukrishnan, Viswanath Poosala, Torsten Suel:
On Rectangular Partitionings in Two Dimensions: Algorithms, Complexity, and Applications.
ICDT 1999: 236-256
- Pedro Furtado, Henrique Madeira:
Summary Grids: Building Accurate Multidimensional Histograms.
DASFAA 1999: 187-194
- Joseph M. Hellerstein:
Optimization Techniques for Queries with Expensive Methods.
ACM Trans. Database Syst. 23(2): 113-157(1998)
- M. Seetha Lakshmi, Shaoyu Zhou:
Selectivity Estimation in Extensible Databases - A Neural Network Approach.
VLDB 1998: 623-627
- H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Viswanath Poosala, Kenneth C. Sevcik, Torsten Suel:
Optimal Histograms with Quality Guarantees.
VLDB 1998: 275-286
- Navin Kabra, David J. DeWitt:
Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans.
SIGMOD Conference 1998: 106-117
- Phillip B. Gibbons, Yossi Matias:
New Sampling-Based Summary Statistics for Improving Approximate Query Answers.
SIGMOD Conference 1998: 331-342
- Surajit Chaudhuri, Rajeev Motwani, Vivek R. Narasayya:
Random Sampling for Histogram Construction: How much is enough?
SIGMOD Conference 1998: 436-447
- Hubert Naacke, Georges Gardarin, Anthony Tomasic:
Leveraging Mediator Cost Models with Heterogeneous Data Sources.
ICDE 1998: 351-360
- Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik:
The New Jersey Data Reduction Report.
IEEE Data Eng. Bull. 20(4): 3-45(1997)
- Viswanath Poosala, Yannis E. Ioannidis:
Selectivity Estimation Without the Attribute Value Independence Assumption.
VLDB 1997: 486-495
- Christos Faloutsos, H. V. Jagadish, Nikolaos Sidiropoulos:
Recovering Information from Summary Data.
VLDB 1997: 36-45
- Yannis E. Ioannidis:
Query Optimization.
ACM Comput. Surv. 28(1): 121-123(1996)
- Viswanath Poosala, Yannis E. Ioannidis:
Estimation of Query-Result Distribution and its Application in Parallel-Join Load Balancing.
VLDB 1996: 448-459
- Christos Faloutsos, Yossi Matias, Abraham Silberschatz:
Modeling Skewed Distribution Using Multifractals and the `80-20' Law.
VLDB 1996: 307-317
- Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita:
Improved Histograms for Selectivity Estimation of Range Predicates.
SIGMOD Conference 1996: 294-305
- P. Krishnan, Jeffrey Scott Vitter, Balakrishna R. Iyer:
Estimating Alphanumeric Selectivity in the Presence of Wildcards.
SIGMOD Conference 1996: 282-293
- Yannis E. Ioannidis, Viswanath Poosala:
Histogram-Based Solutions to Diverse Database Estimation Problems.
IEEE Data Eng. Bull. 18(3): 10-18(1995)
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:26 2009