
Fast Text Access Methods for Optical and Large Magnetic Disks: Designs and Performance Comparison.

Christos Faloutsos, Raphael Chan: Fast Text Access Methods for Optical and Large Magnetic Disks: Designs and Performance Comparison. VLDB 1988: 280-293
@inproceedings{DBLP:conf/vldb/FaloutsosC88,
  author    = {Christos Faloutsos and
               Raphael Chan},
  editor    = {Fran\c{c}ois Bancilhon and
               David J. DeWitt},
  title     = {Fast Text Access Methods for Optical and Large Magnetic Disks:
               Designs and Performance Comparison},
  booktitle = {Fourteenth International Conference on Very Large Data Bases,
               August 29 - September 1, 1988, Los Angeles, California, USA,
               Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1988},
  isbn      = {0-934613-75-3},
  pages     = {280-293},
  ee        = {db/conf/vldb/FaloutsosC88.html},
  crossref  = {DBLP:conf/vldb/88},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

High-capacity disks, especially optical ones, are commercially available. These disks are ideal for archiving large text data bases. In this work, we examine efficient searching techniques for such applications. We propose a unifying framework, which reveals the similarities between signature files and an inverted file using a hash table. Then, we design methods that combine the ease of insertion of the signature files with the fast retrieval of the inverted files. We develop analytical models for their performance and verify them through experimentation on a 2.8 Mb data base. The agreement between theory and experimentation is very good. The results show that the proposed methods achieve fast retrieval, require a modest 10%-30% space overhead (as opposed to the 50%-300% overhead [13] for the inverted files), and do not require rewriting; thus, they can handle insertions easily, they permit searches during an insertion, and they can be used with write-once optical disks. Using our verified model, the performance predictions for the proposed methods on large data bases (e.g., 250 Mb) are very promising.

Copyright © 1988 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.



Printed Edition

François Bancilhon, David J. DeWitt (Eds.): Fourteenth International Conference on Very Large Data Bases, August 29 - September 1, 1988, Los Angeles, California, USA, Proceedings. Morgan Kaufmann 1988, ISBN 0-934613-75-3

References

[1]
...
[2]
Stavros Christodoulakis, Christos Faloutsos: Design Considerations for a Message File Server. IEEE Trans. Software Eng. 10(2): 201-210(1984)
[3]
Stavros Christodoulakis, F. Ho, M. Theodoridou: The Multimedia Object Presentation Manager of MINOS: A Symmetric Approach. SIGMOD Conference 1986: 295-310
[4]
...
[5]
Christos Faloutsos: Access Methods for Text. ACM Comput. Surv. 17(1): 49-74(1985)
[6]
...
[7]
...
[8]
Christos Faloutsos, Stavros Christodoulakis: Signature Files: An Access Method for Documents and Its Analytical Performance Evaluation. ACM Trans. Inf. Syst. 2(4): 267-288(1984)
[9]
Christos Faloutsos, Stavros Christodoulakis: Description and Performance Analysis of Signature File Methods for Office Filing. ACM Trans. Inf. Syst. 5(3): 237-257(1987)
[10]
Larry Fujitani: Laser Optical Disk: The Coming Revolution in On-Line Storage. Commun. ACM 27(6): 546-554(1984)
[11]
...
[12]
Gaston H. Gonnet, Frank Wm. Tompa: Mind Your Grammar: a New Approach to Modelling Text. VLDB 1987: 339-346
[13]
...
[14]
...
[15]
...
[16]
Gary D. Knott: Expandable Open Addressing Hash Table Storage and Retrieval. SIGFIDET Workshop 1971: 187-206
[17]
...
[18]
...
[19]
James L. Peterson: Computer Programs for Detecting and Correcting Spelling Errors. Commun. ACM 23(12): 676-687(1980)
[20]
John L. Pfaltz, William J. Berman, Edgar M. Cagley: Partial-Match Retrieval Using Indexed Descriptor Files. Commun. ACM 23(9): 522-528(1980)
[21]
...
[22]
...
[23]
Kotagiri Ramamohanarao, John Shepherd: A Superimposed Codeword Indexing Scheme for Very Large Prolog Databases. ICLP 1986: 569-576
[24]
Ron Sacks-Davis, Kotagiri Ramamohanarao: A two level superimposed coding scheme for partial match retrieval. Inf. Syst. 8(4): 273-289(1983)
[25]
Gerard Salton, Michael McGill: Introduction to Modern Information Retrieval. McGraw-Hill Book Company 1984, ISBN 0-07-054484-0
[26]
Thomas A. Standish: An Essay on Software Reuse. IEEE Trans. Software Eng. 10(5): 494-497(1984)
[27]
Craig Stanfill, Brewster Kahle: Parallel Free-Text Search on the Connection Machine System. Commun. ACM 29(12): 1229-1239(1986)
[28]
...
[29]
George R. Thoma, S. Suthasinekul, F. L. Walker, J. Cookson, M. Rashidian: A Prototype System for the Electronic Storage and Retrieval of Document Images. ACM Trans. Inf. Syst. 3(3): 279-291(1985)
[30]
Dennis Tsichritzis, Stavros Christodoulakis: Message Files. ACM Trans. Inf. Syst. 1(1): 88-98(1983)

Referenced by

  1. Charu C. Aggarwal, Joel L. Wolf, Philip S. Yu: A New Method for Similarity Indexing of Market Basket Data. SIGMOD Conference 1999: 407-418
  2. Kjetil Nørvåg: Efficient Use of Signatures in Object-Oriented Database Systems. ADBIS 1999: 367-381
  3. George Panagopoulos, Christos Faloutsos: Bit-Sliced Signature Files for Very Large Text Databases on a Parallel Machine Architecture. EDBT 1994: 379-392
  4. Yoshiharu Ishikawa, Hiroyuki Kitagawa, Nobuo Ohbo: Evaluation of Signature Files as Set Access Facilities in OODBs. SIGMOD Conference 1993: 247-256
  5. Zheng Lin, Christos Faloutsos: Frame-Sliced Signature Files. IEEE Trans. Knowl. Data Eng. 4(3): 281-289(1992)
  6. Christos Faloutsos, H. V. Jagadish: Hybrid Index Organizations for Text Databases. EDBT 1992: 310-327
  7. Roger King, Ali Morfeq: Bayan: An Arabic Text Database Management System. SIGMOD Conference 1990: 12-23
  8. Jae-Woo Chang, Yoon-Joon Lee: Multikey Access Scheme Based on Term Discrimination and Signature Clustering. DASFAA 1989: 211-218