Dynamic Maintenance of Data Distribution for Selectivity Estimation.
Kyu-Young Whang, Sang-Wook Kim, Gio Wiederhold:
Dynamic Maintenance of Data Distribution for Selectivity Estimation.
VLDB J. 3(1): 29-51(1994)@article{DBLP:journals/vldb/WhangKW94,
author = {Kyu-Young Whang and
Sang-Wook Kim and
Gio Wiederhold},
title = {Dynamic Maintenance of Data Distribution for Selectivity Estimation},
journal = {VLDB J.},
volume = {3},
number = {1},
year = {1994},
pages = {29-51},
ee = {db/journals/vldb/WhangKW94.html},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
We propose a new dynamic method for multidimensional selectivity estimation
for range queries that works accurately independent of data distribution.
Good estimation of selectivity is important for query optimization and physical
database design. Our method employs the multilevel grid file (MLGF) for
accurate estimation of multidimensional data distribution. The MLGF is a dynamic,
hierarchical, balanced, multidimensional file structure that gracefully adapts to
nonuniform and correlated distributions. We show that the MLGF directory naturally
represents a multidimensional data distribution. We then extent if for further
refinement and present the selectivity estimation method based on the MLGF.
Extensive experiments have been performed to test the accuracy of
selectivity estimation.
The results show that estimation errors are very small independent of
distributions, even with coorelated and/or highly skewed ones.
Finally, we analyze the cause of errors in estimation and investigate the
effects of various parameters on the accuracy of estimation.
Copyright © 1994 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Key Words
Query optimization,
physical database design,
multidimensional file structure,
multilevel grid files.
Online Paper
CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ...
DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...
BibTeX
References
- [Astrahan et al. 1976]
- Morton M. Astrahan, Mike W. Blasgen, Donald D. Chamberlin, Kapali P. Eswaran, Jim Gray, Patricia P. Griffiths, W. Frank King III, Raymond A. Lorie, Paul R. McJones, James W. Mehl, Gianfranco R. Putzolu, Irving L. Traiger, Bradford W. Wade, Vera Watson:
System R: Relational Approach to Database Management.
ACM Trans. Database Syst. 1(2): 97-137(1976) BibTeX
- [Chen et al. 1990]
- Meng Chang Chen, Lawrence McNamee, Norman S. Matloff:
Selectivity Estimation Using Homogeneity Measurement.
ICDE 1990: 304-310 BibTeX
- [Christodoulakis 1083]
- Stavros Christodoulakis:
Estimating record selectivities.
Inf. Syst. 8(2): 105-115(1983) BibTeX
- [Fagin et al. 1979]
- Ronald Fagin, Jürg Nievergelt, Nicholas Pippenger, H. Raymond Strong:
Extendible Hashing - A Fast Access Method for Dynamic Files.
ACM Trans. Database Syst. 4(3): 315-344(1979) BibTeX
- [Freeston 1987]
- Michael Freeston:
The BANG File: A New Kind of Grid File.
SIGMOD Conference 1987: 260-269 BibTeX
- [Ioannidis & Christodoulakis 1991]
- Yannis E. Ioannidis, Stavros Christodoulakis:
On the Propagation of Errors in the Size of Join Results.
SIGMOD Conference 1991: 268-277 BibTeX
- [Mannino et al. 1988]
- Michael V. Mannino, Paicheng Chu, Thomas Sager:
Statistical Profile Estimation in Database Systems.
ACM Comput. Surv. 20(3): 191-221(1988) BibTeX
- [Muralikrishna & DeWitt 1988]
- M. Muralikrishna, David J. DeWitt:
Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries.
SIGMOD Conference 1988: 28-36 BibTeX
- [Muthuswamy & Kershberg 1985]
- B. Muthuswamy, Larry Kerschberg:
A Detailed Database Statistics Model for Realtional Query Optimization.
ACM Annual Conference - The range of computing: mid-80's perspective 1985: 439-448 BibTeX
- [Nievergelt et al. 1984]
- Jürg Nievergelt, Hans Hinterberger, Kenneth C. Sevcik:
The Grid File: An Adaptable, Symmetric Multikey File Structure.
ACM Trans. Database Syst. 9(1): 38-71(1984) BibTeX
- [Otoo 1984]
- Ekow J. Otoo:
A Mapping Function for the Directory of a Multidimensional Extendible Hashing.
VLDB 1984: 493-506 BibTeX
- [Otoo 1986]
- Ekow J. Otoo:
Balanced Multidimensional Extendible Hash Tree.
PODS 1986: 100-113 BibTeX
- [Piatetsky & Connell 1984]
- Gregory Piatetsky-Shapiro, Charles Connell:
Accurate Estimation of the Number of Tuples Satisfying a Condition.
SIGMOD Conference 1984: 256-276 BibTeX
- [Robinson 1981]
- John T. Robinson:
The K-D-B-Tree: A Search Structure For Large Multidimensional Dynamic Indexes.
SIGMOD Conference 1981: 10-18 BibTeX
- [Robinson 1986]
- John T. Robinson:
Order Preserving Linear Hashing Using Dynamic Key Statistics.
PODS 1986: 91-99 BibTeX
- [Selinger et al. 1979]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34 BibTeX
- [Selinger 1991]
- ...
- [Vander Zander et al. 1986]
- Brad T. Vander Zanden, Howard M. Taylor, Dina Bitton:
Estimating Block Accessses when Attributes are Correlated.
VLDB 1986: 119-127 BibTeX
- [Whang & Krihsnamurthy 1985]
- ...
- [Whang & Krishnamurthy 1990]
- Kyu-Young Whang, Ravi Krishnamurthy:
Query Optimization in a Memory-Resident Domain Relational Calculus Database System.
ACM Trans. Database Syst. 15(1): 67-95(1990) BibTeX
- [Whang et al. 1990]
- Kyu-Young Whang, Brad T. Vander Zanden, Howard M. Taylor:
A Linear-Time Probabilistic Counting Algorithm for Database Applications.
ACM Trans. Database Syst. 15(2): 208-229(1990) BibTeX
- [Whang & Krishnamurthy 1991]
- Kyu-Young Whang, Ravi Krishnamurthy:
The Multilevel Grid File - A Dynamic Hierarchical Multidimensional File Structure.
DASFAA 1991: 449-459 BibTeX
- [Wiederhold 1983]
- ...
- [Wiederhold 1987]
- Gio Wiederhold:
File Organisation for Database Design.
McGraw-Hill Book Company 1987, ISBN 0-07-100340-1
BibTeX
Referenced by
- Ju-Hong Lee, Deok-Hwan Kim, Chin-Wan Chung:
Multi-dimensional Selectivity Estimation Using Compressed Histogram Information.
SIGMOD Conference 1999: 205-214
- Paul M. Aoki:
Generalizing ``Search'' in Generalized Search Trees (Extended Abstract).
ICDE 1998: 380-389
- Jong-Hak Lee, Young-Koo Lee, Kyu-Young Whang, Il-Yeol Song:
A Region Splitting Strategy for Physical Database Design of Multidimensional File Organizations.
VLDB 1997: 416-425
- Brian Lent, Arun N. Swami, Jennifer Widom:
Clustering Association Rules.
ICDE 1997: 220-231
- Sang-Wook Kim, Wan-Sup Cho, Min-Jae Lee, Kyu-Young Whang:
A New Algorithm for Processing Joins Using the Multilevel Grid File.
DASFAA 1995: 115-123
- Shashi Shekhar, Babak Hamidzadeh, Ashim Kohli, Mark Coyle:
Learning Transformation Rules for Semantic Query Optimization: A Data-Driven Approach.
IEEE Trans. Knowl. Data Eng. 5(6): 950-964(1993)
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
VLDB Journal: 1992-1995 Copyright © by VLDB Endowment / 1996-... Copyright © by Springer Verlag,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sun May 17 00:31:20 2009