Design of a Signature File Method that Accounts for Non-Uniform Occurrence and Query Frequencies.

Christos Faloutsos, Stavros Christodoulakis: Design of a Signature File Method that Accounts for Non-Uniform Occurrence and Query Frequencies. VLDB 1985: 165-170
  author    = {Christos Faloutsos and
               Stavros Christodoulakis},
  editor    = {Alain Pirotte and
               Yannis Vassiliou},
  title     = {Design of a Signature File Method that Accounts for Non-Uniform
               Occurrence and Query Frequencies},
  booktitle = {VLDB'85, Proceedings of 11th International Conference on Very
               Large Data Bases, August 21-23, 1985, Stockholm, Sweden},
  publisher = {Morgan Kaufmann},
  year      = {1985},
  pages     = {165-170},
  ee        = {db/conf/vldb/FaloutsosC85.html},
  crossref  = {DBLP:conf/vldb/85},
  bibsource = {DBLP}


In this paper we study a variation of the signature file access method for text and attribute retrieval. According to this method, the documents (or records) are stored sequentially in the "text file". Abstractions ("signatures") of the documents (or records) are stored in the "signature file". The latter serves as a filter on retrieval: It helps discarding a large number of non-qualifying documents. We propose a signature extraction method that takes into account the query and occurrence frequencies, thus achieving better performance. The model we present is general enough, so that results can be applied not only for text retrieval but also for files with formatted data.
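
The filtering idea in the abstract can be illustrated with a minimal sketch, assuming plain superimposed coding: each word sets a few hashed bit positions, a document signature is the OR of its word signatures, and a query is checked against the signatures before the text is touched. The 64-bit width, the SHA-256 hashing, and every function name here are assumptions made for illustration. The `bits_for` hook is hypothetical: it only marks where a frequency-dependent choice of bits per word could plug in, and it is not the paper's extraction method, which the abstract does not spell out.

```python
import hashlib

SIGNATURE_BITS = 64  # assumed signature width, not taken from the paper


def word_signature(word, bits_per_word=3):
    """Set up to `bits_per_word` hashed bit positions for one word."""
    sig = 0
    for i in range(bits_per_word):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        position = int.from_bytes(digest[:4], "big") % SIGNATURE_BITS
        sig |= 1 << position
    return sig


def document_signature(text, bits_for=lambda word: 3):
    """Superimpose (OR together) the signatures of all words in a document."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word, bits_for(word))
    return sig


def search(query_word, documents, signatures, bits_for=lambda word: 3):
    """Filter with the signatures, then verify candidates against the text."""
    q = word_signature(query_word, bits_for(query_word))
    hits = []
    for doc, sig in zip(documents, signatures):
        if sig & q == q:  # signature qualifies, but it may be a false drop
            if query_word.lower() in doc.lower().split():  # check the text
                hits.append(doc)
    return hits


# Hypothetical "text file" and its "signature file".
documents = [
    "signature files filter text documents",
    "b-trees index formatted records",
    "superimposed coding builds a signature",
]
signatures = [document_signature(d) for d in documents]
results = search("signature", documents, signatures)
```

In this sketch a document whose signature lacks any of the query bits is discarded without reading its text, which is the filtering role the abstract describes. A signature can still match by coincidence, so the final check against the text remains necessary.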

Copyright © 1985 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Printed Edition

Alain Pirotte, Yannis Vassiliou (Eds.): VLDB'85, Proceedings of 11th International Conference on Very Large Data Bases, August 21-23, 1985, Stockholm, Sweden. Morgan Kaufmann 1985


References

Alfred V. Aho, Margaret J. Corasick: Efficient String Matching: An Aid to Bibliographic Search. Commun. ACM 18(6): 333-340(1975)
Robert S. Boyer, J. Strother Moore: A Fast String Searching Algorithm. Commun. ACM 20(10): 762-772(1977)
Stavros Christodoulakis: Access Files for Batching Queries in Large Information Systems. ICOD 1983: 127-150
Stavros Christodoulakis, Christos Faloutsos: Design Considerations for a Message File Server. IEEE Trans. Software Eng. 10(2): 201-210(1984)
Christos Faloutsos: Signature files: Design and Performance Comparison of Some Signature Extraction Methods. SIGMOD Conference 1985: 63-82
Roger L. Haskin, Raymond A. Lorie: On Extending the Functions of a Relational Database System. SIGMOD Conference 1982: 207-212
Donald E. Knuth, James H. Morris Jr., Vaughan R. Pratt: Fast Pattern Matching in Strings. SIAM J. Comput. 6(2): 323-350(1977)
Ian A. Macleod: A data base management system for document retrieval applications. Inf. Syst. 6(2): 131-137(1981)
Charles S. Roberts: Partial-Match Via the Method of Superimposed Codes. Proceedings of the IEEE 67(12): 1624-1642(1979)
Gerard Salton, Michael McGill: Introduction to Modern Information Retrieval. McGraw-Hill Book Company 1984, ISBN 0-07-054484-0
Dennis Tsichritzis, Stavros Christodoulakis: Message Files. ACM Trans. Inf. Syst. 1(1): 88-98(1983)
Dennis Tsichritzis, Stavros Christodoulakis, P. Economopoulos, Christos Faloutsos, A. Lee, D. Lee, J. Vandenbroeck, Carson C. Woo: A Multimedia Office Filing System. VLDB 1983: 2-7
C. J. van Rijsbergen: Information Retrieval. Butterworth 1979, ISBN 0-408-70929-4
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949

Referenced by

  1. Lauri Malmi, Eljas Soisalon-Soininen: Group Updates for Relaxed Height-Balanced Trees. PODS 1999: 358-367
  2. Byoung Mo Im, Myoung-Ho Kim, Jae Soo Yoo: MIN-Entropy: A New Signature File Declustering Algorithm for Intra-Query Parallelism. DASFAA 1997: 235-242
  3. Kerttu Pollari-Malmi, Eljas Soisalon-Soininen, Tatu Ylönen: Concurrency Control in B-Trees with Batch Updates. IEEE Trans. Knowl. Data Eng. 8(6): 975-984(1996)
  4. Pavel Zezula, Paolo Ciaccia, Paolo Tiberio: Hamming Filters: A Dynamic Signature File Organization for Parallel Stores. VLDB 1993: 314-327
  5. Chun-Wu Roger Leng, Dik Lun Lee: Optimal Weight Assignment for Signature Generation. ACM Trans. Database Syst. 17(2): 346-373(1992)
  6. Walter W. Chang, Hans-Jörg Schek: A Signature Access Method for the Starburst Database System. VLDB 1989: 145-153
  7. Jae-Woo Chang, Yoon-Joon Lee: Multikey Access Scheme Based on Term Discrimination and Signature Clustering. DASFAA 1989: 211-218
  8. Alan J. Kent, Ron Sacks-Davis, Kotagiri Ramamohanarao: A Superimposed Coding Scheme Based on Multiple Block Descriptor Files for Indexing Very Large Data Bases. VLDB 1988: 351-359
  9. Soon Myoung Chung, P. Bruce Berra: A Comparison of Concatenated and Superimposed Code Word Surrogate Files for Very Large Data/Knowledge Bases. EDBT 1988: 364-387
  10. Ron Sacks-Davis, Alan J. Kent, Kotagiri Ramamohanarao: Multikey Access Methods Based on Superimposed Coding Techniques. ACM Trans. Database Syst. 12(4): 655-696(1987)
  11. Christos Faloutsos: Access Methods for Text. ACM Comput. Surv. 17(1): 49-74(1985)
VLDB Proceedings: Copyright © by VLDB Endowment
ACM SIGMOD Anthology: Copyright © by ACM
DBLP: Copyright © by Michael Ley, last change: Sat May 16 23:45:25 2009