ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Unstructured Data Bases or Very Efficient Text Searching.

Gaston H. Gonnet: Unstructured Data Bases or Very Efficient Text Searching. PODS 1983: 117-124
@inproceedings{DBLP:conf/pods/Gonnet83,
  author    = {Gaston H. Gonnet},
  title     = {Unstructured Data Bases or Very Efficient Text Searching},
  booktitle = {Proceedings of the Second ACM SIGACT-SIGMOD Symposium on Principles
               of Database Systems, March 21-23, 1983, Colony Square Hotel,
               Atlanta, Georgia},
  publisher = {ACM},
  year      = {1983},
  isbn      = {0-89791-097-4},
  pages     = {117-124},
  ee        = {http://doi.acm.org/10.1145/588058.588074, db/conf/pods/Gonnet83.html},
  crossref  = {DBLP:conf/pods/83},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

We present several algorithms to search data bases that consist of text. The algorithms apply mostly to very large data bases that are difficult to structure.

We describe algorithms which search the original data base without transformation and hence could be used as general text searching algorithms. We also describe algorithms requiring preprocessing, the best of them achieving a logarithmic behaviour. These efficient algorithms solve the "plagiarism" problem among n papers

The problem of misspellings, ambiguous spellings, simple errors, endings, positional information, etc is nicely treated using signature functions.

Copyright © 1983 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ... BibTeX

Printed Edition

Proceedings of the Second ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, March 21-23, 1983, Colony Square Hotel, Atlanta, Georgia. ACM 1983, ISBN 0-89791-097-4
Contents BibTeX

Online Edition: ACM Digital Library


References

[1]
Alfred V. Aho, Margaret J. Corasick: Efficient String Matching: An Aid to Bibliographic Search. Commun. ACM 18(6): 333-340(1975) BibTeX
[2]
Burton H. Bloom: Space/Time Trade-offs in Hash Coding with Allowable Errors. Commun. ACM 13(7): 422-426(1970) BibTeX
[3]
Robert S. Boyer, J. Strother Moore: A Fast String Searching Algorithm. Commun. ACM 20(10): 762-772(1977) BibTeX
[4]
Larry Carter, Robert W. Floyd, John Gill, George Markowsky, Mark N. Wegman: Exact and Approximate Membership Testers. STOC 1978: 59-65 BibTeX
[5]
Zvi Galil: On Improving the Worse Case Running Time of the Boyer-Moore String Matching Algorithm. Commun. ACM 22(9): 505-508(1979) BibTeX
[6]
...
[7]
R. Nigel Horspool: Practical Fast Searching in Strings. Softw., Pract. Exper. 10(6): 501-506(1980) BibTeX
[8]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
BibTeX
[9]
Donald E. Knuth, James H. Morris Jr., Vaughan R. Pratt: Fast Pattern Matching in Strings. SIAM J. Comput. 6(2): 323-350(1977) BibTeX
[10]
John L. Pfaltz, William J. Berman, Edgar M. Cagley: Partial-Match Retrieval Using Indexed Descriptor Files. Commun. ACM 23(9): 522-528(1980) BibTeX
[11]
Ronald L. Rivest: Partial-Match Retrieval Algorithms. SIAM J. Comput. 5(1): 19-50(1976) BibTeX
[12]
Charles S. Roberts: Partial-Match Via the Method of Superimposed Codes. Proceedings of the IEEE 67(12): 1624-1642(1979) BibTeX
[13]
Alan L. Tharp, Kuo-Chung Tai: The Practicality of Text Signatures for Accelerating String Searching. Softw., Pract. Exper. 12(1): 35-44(1982) BibTeX

Referenced by

  1. Dirk Van Gucht: On the Expressive Power of the Extended Relational Algebra for the Unnormalized Relational Model. PODS 1987: 302-312
  2. Dirk Van Gucht, Patrick C. Fischer: Some Classes of Multilevel Relational Structures. PODS 1986: 60-69
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:33:42 2009