Space/Time Trade-offs in Hash Coding with Allowable Errors.
Burton H. Bloom:
Space/Time Trade-offs in Hash Coding with Allowable Errors.
Commun. ACM 13(7): 422-426(1970)@article{DBLP:journals/cacm/Bloom70,
author = {Burton H. Bloom},
title = {Space/Time Trade-offs in Hash Coding with Allowable Errors},
journal = {Commun. ACM},
volume = {13},
number = {7},
year = {1970},
pages = {422-426},
ee = {db/journals/cacm/Bloom70.html},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
In this paper trade-offs among certain computational factors
in hash coding are analyzed.
The paradigm problem considered is that of testing a series of messages one-by-one
for membership in a given set of messages. Two new hash-coding methods are examined
and compared with a particular conventional hash-coding method.
The computational factors considered are the size of the hash area
(space), the time required to identify a message as a nonmember of the
given set (reject time), and an allowable error frequency.
The new methods
are intended to reduce the amount of space required to contain the hash-coded
information from that associated with conventional methods. The reduction in
space is accomplished by exploiting the possibility that a small fraction of
errors of commission may be tolerable in some applications, in particular,
applications in which a large amount of data is involved and a core resident
hash area is consequently not feasible using conventional methods.
In such
applications, it is envisaged that overall performance
could be improved by using a smaller core resident hash area in
conjunction with the new methods and, when necessary, by using some
secondary and perhaps time-consuming test to "catch" the small
fraction of errors associated with new methods. An example is discussed
which illustrates possible areas of application for the new
methods.
Analysis of the paradigm problem demonstrates that allowing
a small number of test messages to be falsely identified as
members of the given set will permit a much smaller hash
area to be used without increasing reject time.
Copyright © 1970 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ...
DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...
BibTeX
References
- [1]
- ...
- [2]
- ...
- [3]
- ...
Referenced by
- Alfons Kemper, Donald Kossmann, Christian Wiesner:
Generalised Hash Teams for Join and Group-by.
VLDB 1999: 30-41
- Charles Bontempo, George Zagelow:
The IBM Data Warehouse Architecture.
Commun. ACM 41(9): 38-48(1998)
- Theodore Johnson:
Coarse Indices for a Tape-Based Data Warehouse.
ICDE 1998: 231-240
- Praveen Seshadri, Joseph M. Hellerstein, Hamid Pirahesh, T. Y. Cliff Leung, Raghu Ramakrishnan, Divesh Srivastava, Peter J. Stuckey, S. Sudarshan:
Cost-Based Optimization for Magic: Algebra and Implementation.
SIGMOD Conference 1996: 435-446
- Zhe Li, Kenneth A. Ross:
PERF Join: An Alternative to Two-way Semijoin and Bloomjoin.
CIKM 1995: 137-144
- Goetz Graefe:
Query Evaluation Techniques for Large Databases.
ACM Comput. Surv. 25(2): 73-170(1993)
- Priti Mishra, Margaret H. Eich:
Join Processing in Relational Databases.
ACM Comput. Surv. 24(1): 63-113(1992)
- Arie Segev, Jooseok Park:
Updating Distributed Materialized Views.
IEEE Trans. Knowl. Data Eng. 1(2): 173-184(1989)
- Arie Segev, Himawan Gunadhi:
Event-Join Optimization in Temporal Relational Databases.
VLDB 1989: 205-215
- Alan J. Kent, Ron Sacks-Davis, Kotagiri Ramamohanarao:
A Superimposed Coding Scheme Based on Multiple Block Descriptor Files for Indexing Very Large Data Bases.
VLDB 1988: 351-359
- Ming-Chien Shan:
Optimal Plan Search in a Rule-Based Query Optimizer.
EDBT 1988: 92-112
- Ron Sacks-Davis, Alan J. Kent, Kotagiri Ramamohanarao:
Multikey Access Methods Based on Superimposed Coding Techniques.
ACM Trans. Database Syst. 12(4): 655-696(1987)
- Eric N. Hanson:
A Performance Analysis of View Materialization Strategies.
SIGMOD Conference 1987: 440-453
- Lothar F. Mackert, Guy M. Lohman:
R* Optimizer Validation and Performance Evaluation for Distributed Queries.
VLDB 1986: 149-159
- Lothar F. Mackert, Guy M. Lohman:
R* Optimizer Validation and Performance Evaluation for Local Queries.
SIGMOD Conference 1986: 84-95
- Giovanni Maria Sacco:
Distributed Query Evaluation in Local Area Networks.
ICDE 1984: 510-516
- Gaston H. Gonnet:
Unstructured Data Bases or Very Efficient Text Searching.
PODS 1983: 117-124
- Houtan Aghili, Dennis G. Severance:
A Practical Guide to the Design of Differential Files for Recovery of On-Line Databases.
ACM Trans. Database Syst. 7(4): 540-565(1982)
- Dennis G. Severance, Guy M. Lohman:
Differential Files: Their Application to the Maintenance of Large Databases.
ACM Trans. Database Syst. 1(3): 256-267(1976)
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
CACM, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:51:45 2009