Simple Random Sampling from Relational Databases.
Frank Olken, Doron Rotem:
Simple Random Sampling from Relational Databases.
VLDB 1986: 160-169
@inproceedings{DBLP:conf/vldb/OlkenR86,
author = {Frank Olken and
Doron Rotem},
editor = {Wesley W. Chu and
Georges Gardarin and
Setsuo Ohsuga and
Yahiko Kambayashi},
title = {Simple Random Sampling from Relational Databases},
booktitle = {VLDB'86 Twelfth International Conference on Very Large Data Bases,
August 25-28, 1986, Kyoto, Japan, Proceedings},
publisher = {Morgan Kaufmann},
year = {1986},
isbn = {0-934613-18-4},
pages = {160-169},
ee = {db/conf/vldb/OlkenR86.html},
crossref = {DBLP:conf/vldb/86},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Sampling is a fundamental operation for the auditing
and statistical analysis of large databases. It is not
well supported in existing relational database management
systems. We discuss how to obtain samples
from the results of relational queries without first performing
the query. Specifically, we examine simple random
sampling from selections, projections, joins,
unions, and intersections. We discuss data structures
and algorithms for sampling, and their performance.
We show that samples of relational queries can often
be computed for a small fraction of the effort of computing
the entire relational query, i.e., in time proportional
to sample size, rather than time proportional to
the size of the full result of the relational query.
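As an illustration of sampling from a selection without first computing it, the following is a minimal acceptance/rejection sketch. It is not taken from the paper. It assumes the base relation supports uniform random access by position, modeled here as a Python list, and it draws with replacement. The names `sample_selection`, `relation`, `predicate`, and `max_draws` are hypothetical.

```python
import random


def sample_selection(relation, predicate, sample_size, rng=None, max_draws=1_000_000):
    """Sample (with replacement) from the rows of `relation` that satisfy
    `predicate`, without materializing the full selection.

    Assumption: `relation` allows random access by position (e.g. a list).
    """
    rng = rng or random.Random()
    if not relation:
        raise ValueError("relation is empty")

    sample = []
    draws = 0
    while len(sample) < sample_size:
        if draws >= max_draws:
            # Guard against predicates that match few or no rows.
            raise RuntimeError("too many rejections; predicate may be too selective")
        draws += 1
        # Pick one row uniformly at random from the base relation.
        row = relation[rng.randrange(len(relation))]
        # Accept the row only if it belongs to the selection result.
        if predicate(row):
            sample.append(row)
    return sample


if __name__ == "__main__":
    rows = list(range(1000))
    print(sample_selection(rows, lambda r: r % 2 == 0, 5, rng=random.Random(0)))
```

Under these assumptions the expected number of draws is the sample size divided by the selectivity of the predicate, so the work grows with the sample size rather than with the size of the full selection result.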
Copyright © 1986 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Printed Edition
Wesley W. Chu, Georges Gardarin, Setsuo Ohsuga, Yahiko Kambayashi (Eds.):
VLDB'86 Twelfth International Conference on Very Large Data Bases, August 25-28, 1986, Kyoto, Japan, Proceedings.
Morgan Kaufmann 1986, ISBN 0-934613-18-4
References
- [Chr84]
- Stavros Christodoulakis:
Implications of Certain Assumptions in Database Performance Evaluation.
ACM Trans. Database Syst. 9(2): 163-186(1984)
- [Coc77]
- William G. Cochran:
Sampling Techniques, 3rd Edition.
John Wiley 1977, ISBN 0-471-16240-X
- [Dev86]
- ...
- [EN82]
- Jarmo Ernvall, Olli Nevalainen:
An Algorithm for Unbiased Random Sampling.
Comput. J. 25(1): 45-47(1982)
- [FMR62]
- ...
- [Mor80]
- ...
- [OR86]
- ...
- [Vit84]
- Jeffrey Scott Vitter:
Faster Methods for Random Sampling.
Commun. ACM 27(7): 703-718(1984)
- [Vit85]
- Jeffrey Scott Vitter:
Random Sampling with a Reservoir.
ACM Trans. Math. Softw. 11(1): 37-57(1985)
- [WE80]
- C. K. Wong, Malcolm C. Easton:
An Efficient Method for Weighted Sampling Without Replacement.
SIAM J. Comput. 9(1): 111-113(1980)
- [Wil84]
- Dan E. Willard:
Sampling Algorithms for Differential Batch Retrieval Problems (Extended Abstract).
ICALP 1984: 514-526
- [Yao77]
- S. Bing Yao:
Approximating the Number of Accesses in Database Organizations.
Commun. ACM 20(4): 260-261(1977)
Referenced by
- Surajit Chaudhuri, Rajeev Motwani:
On Sampling and Relational Operators.
IEEE Data Eng. Bull. 22(4): 41-46(1999)
- Surajit Chaudhuri, Rajeev Motwani, Vivek R. Narasayya:
On Random Sampling over Joins.
SIGMOD Conference 1999: 263-274
- H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Viswanath Poosala, Kenneth C. Sevcik, Torsten Suel:
Optimal Histograms with Quality Guarantees.
VLDB 1998: 275-286
- Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik:
The New Jersey Data Reduction Report.
IEEE Data Eng. Bull. 20(4): 3-45(1997)
- Nabil I. Hachem, Chenye Bao, Steve Taylor:
Approximate Query Answering In Numerical Databases.
SSDBM 1996: 63-73
- Augustine C. Ikeji, Farshad Fotouhi:
Computation of Partial Query Results Using An Adaptive Stratified Sampling Technique.
CIKM 1995: 145-149
- Balakrishna R. Iyer, David Wilhite:
Data Compression Support in Databases.
VLDB 1994: 695-704
- Peter J. Haas, Jeffrey F. Naughton, Arun N. Swami:
On the Relative Cost of Sampling for Join Selectivity Estimation.
PODS 1994: 14-24
- Qiang Zhu, Per-Åke Larson:
A Query Sampling Method of Estimating Local Cost Parameters in a Multidatabase System.
ICDE 1994: 144-153
- Wen-Chi Hou, Gultekin Özsoyoglu:
Processing Time-Constrained Aggregate Queries in CASE-DB.
ACM Trans. Database Syst. 18(2): 224-261(1993)
- Richard H. Wolniewicz, Goetz Graefe:
Algebraic Optimization of Computations over Scientific Databases.
VLDB 1993: 13-24
- Peter Bodorik, J. Spruce Riordon, James S. Pyra:
Deciding on Correct Distributed Query Processing.
IEEE Trans. Knowl. Data Eng. 4(3): 253-265(1992)
- Gennady Antoshenkov:
Random Sampling from Pseudo-Ranked B+ Trees.
VLDB 1992: 375-382
- Frank Olken, Doron Rotem:
Maintenance of Materialized Views of Sampling Queries.
ICDE 1992: 632-641
- S. Seshadri, Jeffrey F. Naughton:
Sampling Issues in Parallel Database Systems.
EDBT 1992: 328-343
- Wen-Chi Hou, Gultekin Özsoyoglu:
Statistical Estimators for Aggregate Relational Algebra Queries.
ACM Trans. Database Syst. 16(4): 600-654(1991)
- Kyu-Young Whang, Brad T. Vander Zanden, Howard M. Taylor:
A Linear-Time Probabilistic Counting Algorithm for Database Applications.
ACM Trans. Database Syst. 15(2): 208-229(1990)
- Frank Olken, Doron Rotem:
Random Sampling from Database Files: A Survey.
SSDBM 1990: 92-111
- Frank Olken, Doron Rotem, Ping Xu:
Random Sampling from Hash Files.
SIGMOD Conference 1990: 375-386
- Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider:
Practical Selectivity Estimation through Adaptive Sampling.
SIGMOD Conference 1990: 1-11
- Richard J. Lipton, Jeffrey F. Naughton:
Query Size Estimation by Adaptive Sampling.
PODS 1990: 40-46
- Balakrishna R. Iyer, Daniel M. Dias:
System Issues in Parallel Sorting for Database Systems.
ICDE 1990: 246-255
- Jaideep Srivastava, Jack S. Eddy Tan, Vincent Y. Lum:
TBSAM: An Access Method for Efficient Processing of Statistical Queries.
IEEE Trans. Knowl. Data Eng. 1(4): 414-423(1989)
- Balakrishna R. Iyer, Gary R. Ricard, Peter J. Varman:
Percentile Finding Algorithm for Multiple Sorted Runs.
VLDB 1989: 135-144
- Jaideep Srivastava, Doron Rotem:
Precision-Time Tradeoffs: A Paradigm for Processing Statistical Queries on Databases.
SSDBM 1988: 226-245
- Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja:
Statistical Estimators for Relational Algebra Expressions.
PODS 1988: 276-287
- Don S. Batory:
Concepts for a Database System Compiler.
PODS 1988: 184-192
- Jaideep Srivastava, Vincent Y. Lum:
A Tree Based Access Method (TBSAM) for Fast Processing of Aggregate Queries.
ICDE 1988: 504-510
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:45:28 2009