
A Parallel Indexed Algorithm for Information Retrieval.

Craig Stanfill, Robert Thau, David L. Waltz: A Parallel Indexed Algorithm for Information Retrieval. SIGIR 1989: 88-97
@inproceedings{DBLP:conf/sigir/StanfillTW89,
  author    = {Craig Stanfill and
               Robert Thau and
               David L. Waltz},
  editor    = {Nicholas J. Belkin and
               C. J. van Rijsbergen},
  title     = {A Parallel Indexed Algorithm for Information Retrieval},
  booktitle = {SIGIR'89, 12th International Conference on Research and Development
               in Information Retrieval, Cambridge, Massachusetts, USA, June
               25-28, 1989, Proceedings},
  publisher = {ACM},
  year      = {1989},
  isbn      = {0-89791-321-3},
  pages     = {88-97},
  ee        = {db/conf/sigir/StanfillTW89.html},
  crossref  = {DBLP:conf/sigir/89},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

In this paper we present a parallel document ranking algorithm suitable for use on databases of 1-1000 GB, resident on primary or secondary storage. The algorithm is based on inverted indexes, and has two advantages over a previously published parallel algorithm for retrieval based on signature files. First, it permits the employment of ranking strategies which cannot be easily implemented using signature files, specifically methods which depend on document-term weighting. Second, it permits the interactive searching of databases resident on secondary storage. The algorithm is evaluated via a mixture of analytic and simulation techniques, with a particular focus on how cost-effectiveness and efficiency change as the size of the database, number of processors, and cost of memory are altered. In particular, we find that if the ratio of the number of processors and/or disks to the size of the database is held constant, then the cost-effectiveness of the resulting system remains constant. Furthermore, for a given size of database, there is a number of processors which optimizes cost-effectiveness. Estimated response times are also presented. Using these methods, it appears that cost-effective interactive access to databases in the 100-1000 GB range can be achieved using current technology.
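The abstract names the ingredients (inverted indexes, document-term weighting, multiple processors) but not the procedure itself. The following is a minimal sketch of how ranking over a partitioned inverted index could look, not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, integer document IDs, round-robin assignment of documents to partitions, and the dot-product scoring rule. The partitions are scored one after another here; on a parallel machine each would be handled by its own processor.

```python
from collections import defaultdict
import heapq


def build_partitioned_index(docs, num_partitions):
    """Build one inverted index per partition.

    docs maps an integer doc_id to a dict of term -> document-term weight.
    Documents are assigned round-robin (doc_id % num_partitions), which is
    an illustrative choice, not something stated in the abstract.
    """
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for doc_id in sorted(docs):
        index = partitions[doc_id % num_partitions]
        for term, weight in docs[doc_id].items():
            # Posting: (document, weight of this term in that document).
            index[term].append((doc_id, weight))
    return partitions


def score_partition(index, query):
    """Accumulate query-weight * document-weight for one partition."""
    scores = defaultdict(float)
    for term, query_weight in query.items():
        for doc_id, doc_weight in index.get(term, []):
            scores[doc_id] += query_weight * doc_weight
    return scores


def rank(partitions, query, k):
    """Score every partition, then merge and keep the k best documents.

    Each document lives in exactly one partition, so merging the partial
    score tables never has to combine two scores for the same document.
    """
    merged = {}
    for index in partitions:  # sequential stand-in for parallel workers
        merged.update(score_partition(index, query))
    # Highest score first; ties are broken by the lower doc_id.
    return heapq.nlargest(k, merged.items(), key=lambda item: (item[1], -item[0]))


if __name__ == "__main__":
    docs = {
        0: {"parallel": 2.0, "index": 1.0},
        1: {"index": 3.0},
        2: {"parallel": 1.0, "signature": 4.0},
        3: {"retrieval": 1.0},
    }
    partitions = build_partitioned_index(docs, num_partitions=2)
    print(rank(partitions, {"parallel": 1.0, "index": 1.0}, k=2))
```

Because the weight is stored in each posting, any document-term weighting scheme can be plugged in when the index is built, which is the kind of ranking strategy the abstract says signature files do not easily support.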

Copyright © 1989 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Printed Edition

Nicholas J. Belkin, C. J. van Rijsbergen (Eds.): SIGIR'89, 12th International Conference on Research and Development in Information Retrieval, Cambridge, Massachusetts, USA, June 25-28, 1989, Proceedings. ACM 1989, ISBN 0-89791-321-3

Online Edition: ACM Digital Library

Referenced by

  1. Anthony Tomasic, Hector Garcia-Molina: Issues in Parallel Information Retrieval. IEEE Data Eng. Bull. 17(3): 41-49 (1994)
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:38:35 2009