Creating Segmented Databases from Free Text for Text Retrieval.

Lisa F. Rau, Paul S. Jacobs: Creating Segmented Databases from Free Text for Text Retrieval. SIGIR 1991: 337-346
@inproceedings{DBLP:conf/sigir/RauJ91,
  author    = {Lisa F. Rau and
               Paul S. Jacobs},
  editor    = {Abraham Bookstein and
               Yves Chiaramella and
               Gerard Salton and
               Vijay V. Raghavan},
  title     = {Creating Segmented Databases from Free Text for Text Retrieval},
  booktitle = {Proceedings of the 14th Annual International ACM SIGIR Conference
               on Research and Development in Information Retrieval. Chicago,
               Illinois, USA, October 13-16, 1991 (Special Issue of the SIGIR
               Forum)},
  publisher = {ACM},
  year      = {1991},
  isbn      = {0-89791-448-1},
  pages     = {337-346},
  ee        = {db/conf/sigir/RauJ91.html},
  crossref  = {DBLP:conf/sigir/91},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Indexing text for accurate retrieval is a difficult and important problem. On-line information services generally depend on "keyword" indices rather than other methods of retrieval, because of the practical features of keywords for storage, dissemination and browsing as well as for retrieval. However, these methods of indexing have two major drawbacks: First, keywords must be laboriously assigned by human indexers. Second, they are inaccurate, because of mistakes made by these indexers as well as the difficulties users have in choosing keywords for their queries, and the ambiguity a keyword may have.

Current natural language text processing (NLP) methods help to overcome these problems. Such methods can provide automatic indexing and keyword assignment capabilities that are at least as accurate as human indexers in many applications. In addition, NLP systems can increase the information contained in keyword fields by separating keywords into segments, or distinct fields that capture certain discriminating content or relations among keywords.

This paper reports on a system that uses natural language text processing to derive keywords from free text news stories, separate these keywords into segments, and automatically build a segmented database. The system is used as part of a commercial news "clipping" and retrieval product. Preliminary results show improved accuracy, as well as reduced costs resulting from these automated techniques.

Copyright © 1991 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 3, SIGIR, DASFAA'97, OODBS'86" and ...
DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

Abraham Bookstein, Yves Chiaramella, Gerard Salton, Vijay V. Raghavan (Eds.): Proceedings of the 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Chicago, Illinois, USA, October 13-16, 1991 (Special Issue of the SIGIR Forum). ACM 1991, ISBN 0-89791-448-1

Online Edition: ACM Digital Library

ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:38:39 2009