ACM SIGMOD Anthology TODS dblp.uni-trier.de

On the Complexity of Designing Optimal Partial-Match Retrieval Systems.

Shlomo Moran: On the Complexity of Designing Optimal Partial-Match Retrieval Systems. ACM Trans. Database Syst. 8(4): 543-551(1983)
@article{DBLP:journals/tods/Moran83,
  author    = {Shlomo Moran},
  title     = {On the Complexity of Designing Optimal Partial-Match Retrieval
               Systems},
  journal   = {ACM Trans. Database Syst.},
  volume    = {8},
  number    = {4},
  year      = {1983},
  pages     = {543-551},
  ee        = {http://doi.acm.org/10.1145/319996.320004, db/journals/tods/Moran83.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We consider the problem of designing an information retrieval system on which partial match queries have to be answered. Each record in the system consists of a list of attributes, and a partial match query specifies the values of some of the attributes. The records are stored in buckets in a secondary memory, and in order to answer a partial match query all the buckets that may contain a record satisfying the specifications of that query must be retrieved. The bucket in which a given record is stored is found by a multiple key hashing function, which maps each attribute to a string of a fixed number of bits. The address of that bucket is then represented by the string obtained by concatenating the strings on which the various attributes were mapped. A partial match query may specify only part of the bits in the string representing the address, and the larger the number of bits specified, the smaller the number of buckets that have to be retrieved in order to answer the query.

The optimization problem considered in this paper is that of deciding to how many bits each attribute should be mapped by the bashing function above, so that the expected number of buckets retrieved per query is minimized. Efficient solutions for special cases of this problem have been obtained in [1], [12], and [14]. It is shown that in general the problem is NP-hard, and that if P != NP, it is also not fully approximable. Two heuristic algorithms for the problem are also given and compared.

Copyright © 1983 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Joint ACM SIGMOD / IEEE Computer Society Anthology

CDROM Version: Load the CDROM "Volume 3 Issue 1, TODS 1976-1990" and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ... BibTeX

References

[1]
Alfred V. Aho, Jeffrey D. Ullman: Optimal Partial-Match Retrieval When Fields Are Independently Specified. ACM Trans. Database Syst. 4(2): 168-179(1979) BibTeX
[2]
Azad Bolour: Optimality Properties of Multiple-Key Hashing Functions. J. ACM 26(2): 196-210(1979) BibTeX
[3]
...
[4]
M. R. Garey, David S. Johnson: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman 1979, ISBN 0-7167-1044-7
BibTeX
[5]
M. R. Garey, David S. Johnson: ``Strong'' NP-Completeness Results: Motivation, Examples, and Implications. J. ACM 25(3): 499-508(1978) BibTeX
[6]
Ellis Horowitz, Sartaj Sahni: Exact and Approximate Algorithms for Scheduling Nonidentical Processors. J. ACM 23(2): 317-327(1976) BibTeX
[7]
Oscar H. Ibarra, Chul E. Kim: Fast Approximation Algorithms for the Knapsack and Sum of Subset Problems. J. ACM 22(4): 463-468(1975) BibTeX
[8]
David S. Johnson, Alan J. Demers, Jeffrey D. Ullman, M. R. Garey, Ronald L. Graham: Worst-Case Performance Bounds for Simple One-Dimensional Packing Algorithms. SIAM J. Comput. 3(4): 299-325(1974) BibTeX
[9]
...
[10]
Shlomo Moran: General Approximation Algorithms for some Arithmetical Combinatorial Problems. Theor. Comput. Sci. 14: 289-303(1981) BibTeX
[11]
Azaria Paz, Shlomo Moran: Non Deterministic Polynomial Optimization Problems and their Approximations. Theor. Comput. Sci. 15: 251-277(1981) BibTeX
[12]
Ronald L. Rivest: Partial-Match Retrieval Algorithms. SIAM J. Comput. 5(1): 19-50(1976) BibTeX
[13]
Daniel J. Rosenkrantz, Richard Edwin Stearns, Philip M. Lewis II: An Analysis of Several Heuristics for the Traveling Salesman Problem. SIAM J. Comput. 6(3): 563-581(1977) BibTeX
[14]
James B. Rothnie Jr., Tomas Lozano: Attribute Based File Organization in a Paged Memory Environment. Commun. ACM 17(2): 63-69(1974) BibTeX
[15]
Jeffrey D. Ullman: Principles of Database Systems, 1st Edition. Computer Science Press 1980
BibTeX

Referenced by

  1. Doron Rotem: Clustered Multiattribute Hash Files. PODS 1989: 225-234
  2. Kotagiri Ramamohanarao, John Shepherd, Ron Sacks-Davis: Partial-match Retrieval using Multiple-Key Hashing with Multiple File Copies. DASFAA 1989: 225-232
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
TODS, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Tue Jun 24 18:38:52 2008