ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Estimating the Selectivity of Spatial Queries Using the `Correlation' Fractal Dimension.

Alberto Belussi, Christos Faloutsos: Estimating the Selectivity of Spatial Queries Using the `Correlation' Fractal Dimension. VLDB 1995: 299-310
@inproceedings{DBLP:conf/vldb/BelussiF95,
  author    = {Alberto Belussi and
               Christos Faloutsos},
  editor    = {Umeshwar Dayal and
               Peter M. D. Gray and
               Shojiro Nishio},
  title     = {Estimating the Selectivity of Spatial Queries Using the `Correlation'
               Fractal Dimension},
  booktitle = {VLDB'95, Proceedings of 21th International Conference on Very
               Large Data Bases, September 11-15, 1995, Zurich, Switzerland},
  publisher = {Morgan Kaufmann},
  year      = {1995},
  isbn      = {1-55860-379-4},
  pages     = {299-310},
  ee        = {db/conf/vldb/BelussiF95.html},
  crossref  = {DBLP:conf/vldb/95},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We examine the estimation of selectivities for range and spatial join queries in real spatial databases. As we have shown earlier [FK94], real point sets: (a) violate consistentlythe "uniformity" and "independence" assumptions, (b) can often be described as "fractals", with non-integer (fractal) dimension. In this paper we show that, among the infinite family of fractal dimensions, the so called "Correlation Dimension" D2 is the one that we need to predict the selectivity of spatial join.

The main contribution is that, for all the real and synthetic point-sets we tried, the average number of neighbors for a given point of the point-set follows a power law, with D2 as the exponent. This immediately solves the selectivity estimation for spatial joins, as well as for "biased" range queries (i.e., queries whose centers prefer areas of high point density).

We present the formulas to estimate the selectivity for the biased queries, including an integration constant (Kshape!,) for each query shape. Finally, we show results on real and synthetic point sets, where our formulas achieve very low relative errors (typically about 10%, versus 40%-100% of the uniformity assumption).

Copyright © 1995 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Umeshwar Dayal, Peter M. D. Gray, Shojiro Nishio (Eds.): VLDB'95, Proceedings of 21th International Conference on Very Large Data Bases, September 11-15, 1995, Zurich, Switzerland. Morgan Kaufmann 1995, ISBN 1-55860-379-4
Contents BibTeX

References

[ACF+93]
Manish Arya, William F. Cody, Christos Faloutsos, Joel E. Richardson, Arthur Toya: QBISM: A Prototype 3-D Medical Image Database System. IEEE Data Eng. Bull. 16(1): 38-42(1993) BibTeX
[AIS93]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216 BibTeX
[AS91]
Walid G. Aref, Hanan Samet: Optimization for Spatial Query Processing. VLDB 1991: 81-90 BibTeX
[AS94]
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499 BibTeX
[BF95]
...
[BKSS90]
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. SIGMOD Conference 1990: 322-331 BibTeX
[BS88]
...
[CE92]
...
[Chr84]
Stavros Christodoulakis: Implications of Certain Assumptions in Database Performance Evaluation. ACM Trans. Database Syst. 9(2): 163-186(1984) BibTeX
[DKPY94]
David J. DeWitt, Navin Kabra, Jun Luo, Jignesh M. Patel, Jie-Bing Yu: Client-Server Paradise. VLDB 1994: 558-569 BibTeX
[Fal92]
...
[FBF+94]
Christos Faloutsos, Ron Barber, Myron Flickner, Jim Hafner, Wayne Niblack, Dragutin Petkovic, William Equitz: Efficient and Effective Querying by Image Content. J. Intell. Inf. Syst. 3(3/4): 231-262(1994) BibTeX
[FJ92]
Christos Faloutsos, H. V. Jagadish: On B-Tree Indices for Skewed Distributions. VLDB 1992: 363-374 BibTeX
[FK94]
Christos Faloutsos, Ibrahim Kamel: Beyond Uniformity and Independence: Analysis of R-trees Using the Concept of Fractal Dimension. PODS 1994: 4-13 BibTeX
[FRM94]
Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos: Fast Subsequence Matching in Time-Series Databases. SIGMOD Conference 1994: 419-429 BibTeX
[FSR87]
Christos Faloutsos, Timos K. Sellis, Nick Roussopoulos: Analysis of Object Oriented Spatial Access Methods. SIGMOD Conference 1987: 426-439 BibTeX
[Gra90]
...
[Gut84]
Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD Conference 1984: 47-57 BibTeX
[IC91]
Yannis E. Ioannidis, Stavros Christodoulakis: On the Propagation of Errors in the Size of Join Results. SIGMOD Conference 1991: 268-277 BibTeX
[IC94]
Yannis E. Ioannidis, Stavros Christodoulakis: Optimal Histograms for Limiting Worst-Case Error Propagation in the Size of Join Results. ACM Trans. Database Syst. 18(4): 709-748(1993) BibTeX
[KF94]
Ibrahim Kamel, Christos Faloutsos: Hilbert R-tree: An Improved R-tree using Fractals. VLDB 1994: 500-509 BibTeX
[Man77]
...
[MD88]
M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 1988: 28-36 BibTeX
[NS86]
...
[Ore86]
Jack A. Orenstein: Spatial Query Processing in an Object-Oriented Database System. SIGMOD Conference 1986: 326-336 BibTeX
[Ore90]
Jack A. Orenstein: A Comparison of Spatial Query Processing Techniques for Native and Parameter Spaces. SIGMOD Conference 1990: 343-352 BibTeX
[PSTW93]
Bernd-Uwe Pagel, Hans-Werner Six, Heinrich Toben, Peter Widmayer: Towards an Analysis of Range Query Performance in Spatial Data Structures. PODS 1993: 214-221 BibTeX
[RKV95]
Nick Roussopoulos, Stephen Kelley, Frédéic Vincent: Nearest Neighbor Queries. SIGMOD Conference 1995: 71-79 BibTeX
[Sch88]
...
[Sch91]
...
[SFGM93]
Michael Stonebraker, James Frew, Kenn Gardels, Jeff Meredith: The Sequoia 2000 Benchmark. SIGMOD Conference 1993: 2-11 BibTeX
[Smi92]
...
[Zip49]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949
BibTeX

Referenced by

  1. Christos Faloutsos, Bernhard Seeger, Agma J. M. Traina, Caetano Traina Jr.: Spatial Join Selectivity Using Power Laws. SIGMOD Conference 2000: 177-188
  2. Christian Böhm, Hans-Peter Kriegel: Dynamically Optimizing High-Dimensional Index Structures. EDBT 2000: 36-50
  3. Ju-Hong Lee, Deok-Hwan Kim, Chin-Wan Chung: Multi-dimensional Selectivity Estimation Using Compressed Histogram Information. SIGMOD Conference 1999: 205-214
  4. Swarup Acharya, Viswanath Poosala, Sridhar Ramaswamy: Selectivity Estimation in Spatial Databases. SIGMOD Conference 1999: 13-24
  5. Joseph M. Hellerstein, Lisa Hellerstein, George Kollios: On the Generation of 2-Dimensional Index Workloads. ICDT 1999: 113-130
  6. Kevin S. Beyer, Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft: When Is ''Nearest Neighbor'' Meaningful? ICDT 1999: 217-235
  7. Guido Proietti, Christos Faloutsos: I/O Complexity for Range Queries on Region Data Stored Using an R-tree. ICDE 1999: 628-635
  8. Volker Gaede, Oliver Günther: Multidimensional Access Methods. ACM Comput. Surv. 30(2): 170-231(1998)
  9. Apostolos Papadopoulos, Yannis Manolopoulos: Similarity Query Processing Using Disk Arrays. SIGMOD Conference 1998: 225-236
  10. Elias Koutsoupias, David Scot Taylor: Tight Bounds for 2-Dimensional Indexing Schemes. PODS 1998: 52-58
  11. Paolo Ciaccia, Marco Patella, Pavel Zezula: A Cost Model for Similarity Queries in Metric Spaces. PODS 1998: 59-68
  12. Joseph M. Hellerstein, Elias Koutsoupias, Christos H. Papadimitriou: On the Analysis of Indexing Schemes. PODS 1997: 249-256
  13. Apostolos Papadopoulos, Yannis Manolopoulos: Performance of Nearest Neighbor Queries in R-Trees. ICDT 1997: 394-408
  14. Christos Faloutsos, Yossi Matias, Abraham Silberschatz: Modeling Skewed Distribution Using Multifractals and the `80-20' Law. VLDB 1996: 307-317
  15. Christos Faloutsos, Volker Gaede: Analysis of n-Dimensional Quadtrees using the Hausdorff Fractal Dimension. VLDB 1996: 40-50
  16. Bernd-Uwe Pagel, Hans-Werner Six: Are Window Queries Representative for Arbitrary Range Queries? PODS 1996: 150-160
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:46:05 2009