An Efficient Indexing Technique for Full Text Databases.

Justin Zobel, Alistair Moffat, Ron Sacks-Davis: An Efficient Indexing Technique for Full Text Databases. VLDB 1992: 352-362
@inproceedings{DBLP:conf/vldb/ZobelMS92,
  author    = {Justin Zobel and
               Alistair Moffat and
               Ron Sacks-Davis},
  editor    = {Li-Yan Yuan},
  title     = {An Efficient Indexing Technique for Full Text Databases},
  booktitle = {18th International Conference on Very Large Data Bases, August
               23-27, 1992, Vancouver, Canada, Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1992},
  isbn      = {1-55860-151-1},
  pages     = {352-362},
  ee        = {db/conf/vldb/ZobelMS92.html},
  crossref  = {DBLP:conf/vldb/92},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Full-text database systems require an index to allow fast access to documents based on their content. We propose an inverted file indexing scheme based on compression. This scheme allows users to retrieve documents using words occurring in the documents, sequences of adjacent words, and statistical ranking techniques. The compression methods chosen ensure that the storage requirements are small and that dynamic update is straightforward. The only assumption that we make is that sufficient main memory is available to support an in-memory vocabulary; given this assumption, the method we describe requires at most one disc access per query term to identify answers to queries.
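
The abstract does not say which compression code the scheme uses, so the following is only a minimal sketch of the general idea of a compressed inverted file, not the authors' method. It assumes that each word's list of document numbers is stored as gaps between successive numbers and that each gap is written with an Elias gamma code. The names `gamma_encode`, `gamma_decode_all`, `build_index`, and `lookup` are hypothetical, and a Python dictionary stands in for both the in-memory vocabulary and the on-disc lists.

```python
from collections import defaultdict


def gamma_encode(n: int) -> str:
    """Elias gamma code for n >= 1, as a string of '0'/'1' characters."""
    if n < 1:
        raise ValueError("gamma codes are defined for positive integers")
    binary = bin(n)[2:]                      # e.g. 5 -> '101'
    return "0" * (len(binary) - 1) + binary  # unary length prefix, then the value


def gamma_decode_all(bits: str) -> list[int]:
    """Decode a concatenation of well-formed gamma codes back into integers."""
    values, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":                # count the unary length prefix
            zeros += 1
            i += 1
        values.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return values


def build_index(docs: list[str]) -> dict[str, str]:
    """Map each word to a gamma-coded list of gaps between document numbers."""
    postings = defaultdict(list)
    for doc_id, text in enumerate(docs, start=1):   # document numbers start at 1
        for word in set(text.lower().split()):
            postings[word].append(doc_id)
    index = {}
    for word, ids in postings.items():
        # Assumption: store the first number, then differences (always >= 1).
        gaps = [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]
        index[word] = "".join(gamma_encode(g) for g in gaps)
    return index


def lookup(index: dict[str, str], word: str) -> list[int]:
    """Return the document numbers containing word, by summing decoded gaps."""
    total, result = 0, []
    for gap in gamma_decode_all(index.get(word.lower(), "")):
        total += gap
        result.append(total)
    return result
```

In this sketch a query term costs one dictionary probe plus one sequential decode of its list. That loosely mirrors the abstract's claim of at most one disc access per query term when the vocabulary fits in main memory, although a real system would keep the compressed lists on disc rather than in a dictionary.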

Copyright © 1992 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Printed Edition

Li-Yan Yuan (Ed.): 18th International Conference on Very Large Data Bases, August 23-27, 1992, Vancouver, Canada, Proceedings. Morgan Kaufmann 1992, ISBN 1-55860-151-1

References

[BK91]
Abraham Bookstein, Shmuel T. Klein: Compression of a Set of Correlated Bitmaps. SIGIR 1991: 63-71
[BKR92]
...
[BWC89]
Timothy C. Bell, Ian H. Witten, John G. Cleary: Modeling for Text Compression. ACM Comput. Surv. 21(4): 557-591(1989)
[CS88]
W. Bruce Croft, Pasquale Savino: Implementing Ranking Strategies Using Text Signatures. ACM Trans. Inf. Syst. 6(1): 42-62(1988)
[Eli75]
...
[Fal85a]
Christos Faloutsos: Access Methods for Text. ACM Comput. Surv. 17(1): 49-74(1985)
[Fal85b]
Christos Faloutsos: Signature files: Design and Performance Comparison of Some Signature Extraction Methods. SIGMOD Conference 1985: 63-82
[FK85]
...
[GV75]
...
[Has91]
...
[HC90]
...
[KSDR90]
...
[McI82]
...
[Mof92]
Alistair Moffat: Economical Inversion of Large Text Files. Computing Systems 5(2): 125-139(1992)
[MZ92a]
...
[MZ92b]
Alistair Moffat, Justin Zobel: Parameterised Compression for Sparse Bitmaps. SIGIR 1992: 274-285
[SDKR87]
Ron Sacks-Davis, Alan J. Kent, Kotagiri Ramamohanarao: Multikey Access Methods Based on Superimposed Coding Techniques. ACM Trans. Database Syst. 12(4): 655-696(1987)
[SFW83]
Gerard Salton, Edward A. Fox, Harry Wu: Extended Boolean Information Retrieval. Commun. ACM 26(11): 1022-1036(1983)
[SM83]
Gerard Salton, Michael McGill: Introduction to Modern Information Retrieval. McGraw-Hill Book Company 1984, ISBN 0-07-054484-0
[Teu78]
Jukka Teuhola: A Compression Method for Clustered Bit-Vectors. Inf. Process. Lett. 7(6): 308-311(1978)
[WBN91]
...
[WLO+85]
Harry K. T. Wong, Hsiu-Fen Liu, Frank Olken, Doron Rotem, Linda Wong: Bit Transposed Files. VLDB 1985: 448-457
[Zip49]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949
[ZM92a]
...
[ZM92b]
...
[ZTSD91]
Justin Zobel, James A. Thom, Ron Sacks-Davis: Efficiency of Nested Relational Document Database Systems. VLDB 1991: 91-102

Referenced by

  1. Beng Chin Ooi, Kian-Lee Tan, Tat-Seng Chua, Wynne Hsu: Fast Image Retrieval Using Color-Spatial Information. VLDB J. 7(2): 115-128(1998)
  2. Justin Zobel, Alistair Moffat, Kotagiri Ramamohanarao: Inverted Files Versus Signature Files for Text Indexing. ACM Trans. Database Syst. 23(4): 453-490(1998)
  3. Björn Þór Jónsson, Michael J. Franklin, Divesh Srivastava: Interaction of Query Evaluation and Buffer Management for Information Retrieval. SIGMOD Conference 1998: 118-129
  4. Ron Sacks-Davis: The Structured Information Manager: A Database System for SGML Documents. VLDB 1996: 596
  5. Maxim Martynov, Boris Novikov: An Indexing Algorithm for Text Retrieval. ADBIS 1996: 171-175
  6. Ron Sacks-Davis, Alan J. Kent, Kotagiri Ramamohanarao, James A. Thom, Justin Zobel: Atlas: A Nested Relational Database System for Text Applications. IEEE Trans. Knowl. Data Eng. 7(3): 454-470(1995)
  7. Arthur M. Keller, Jeffrey D. Ullman: A Version Numbering Scheme with a Useful Lexicographical Order. ICDE 1995: 240-248
  8. Charles L. Viles, James C. French: On the Update of Term Weights in Dynamic Information Retrieval Systems. CIKM 1995: 167-174
  9. Anthony Tomasic, Hector Garcia-Molina: Issues in Parallel Information Retrieval. IEEE Data Eng. Bull. 17(3): 41-49(1994)
  10. Eric W. Brown, James P. Callan, W. Bruce Croft: Fast Incremental Indexing for Full-Text Information Retrieval. VLDB 1994: 192-202
  11. Anthony Tomasic, Hector Garcia-Molina, Kurt A. Shoens: Incremental Updates of Inverted Lists for Text Document Retrieval. SIGMOD Conference 1994: 289-300
  12. Alistair Moffat, Justin Zobel: Fast Ranking in Limited Space. ICDE 1994: 428-437
  13. Eric W. Brown, James P. Callan, W. Bruce Croft, J. Eliot B. Moss: Supporting Full-Text Information Retrieval with a Persistent Object Store. EDBT 1994: 365-378
  14. Anthony Tomasic, Hector Garcia-Molina: Query Processing and Inverted Indices in Shared-Nothing Document Information Retrieval Systems. VLDB J. 2(3): 243-275(1993)
  15. Justin Zobel, Alistair Moffat, Ron Sacks-Davis: Searching Large Lexicons for Partially Specified Terms using Compressed Inverted Files. VLDB 1993: 290-301
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:45:52 2009