ACM SIGMOD Anthology SIGIR dblp.uni-trier.de

Using WordNet to Disambiguate Word Senses for Text Retrieval.

Ellen M. Voorhees: Using WordNet to Disambiguate Word Senses for Text Retrieval. SIGIR 1993: 171-180
@inproceedings{DBLP:conf/sigir/Voorhees93,
  author    = {Ellen M. Voorhees},
  editor    = {Robert Korfhage and
               Edie M. Rasmussen and
               Peter Willett 0002},
  title     = {Using WordNet to Disambiguate Word Senses for Text Retrieval},
  booktitle = {Proceedings of the 16th Annual International ACM-SIGIR Conference
               on Research and Development in Information Retrieval. Pittsburgh,
               PA, USA, June 27 - July 1, 1993},
  publisher = {ACM},
  year      = {1993},
  isbn      = {0-89791-605-0},
  pages     = {171-180},
  ee        = {db/conf/sigir/Voorhees93.html},
  crossref  = {DBLP:conf/sigir/93},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

This paper describes an automatic indexing procedure that uses the "IS-A" relations contained within WordNet and the set of nouns contained in a text to select a sense for each plysemous noun in the text. The result of the indexing procedure is a vector in which some of the terms represent word senses instead of word stems. Retrieval experiments comparing the effectivenss of these sense-based vectors vs. stem-based vectors show the stem-based vectors to be superior overall, although the sense-based vectors do improve the performance of some queries. The overall degradation is due in large part to the difficulty of disambiguating senses in short query statements. An analysis of these results suggests two conclusions: the IS-A links define a generalization/specialization hierarchy that is not sufficient to reliably select the correct sense of a noun from the set of fine sense distinctions in WordNet; and missing correct matches because of incorrect sense resolution has a much more deleterious effect on retrieval performance than does making spurious matches.

Copyright © 1993 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 3, SIGIR, DASFAA'97, OODBS'86" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Robert Korfhage, Edie M. Rasmussen, Peter Willett (Eds.): Proceedings of the 16th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Pittsburgh, PA, USA, June 27 - July 1, 1993. ACM 1993, ISBN 0-89791-605-0
Contents BibTeX

Online Edition: ACM Digital Library

Citation page

Referenced by

  1. Soumen Chakrabarti, Byron Dom, Rakesh Agrawal, Prabhakar Raghavan: Scalable Feature Selection, Classification and Signature Generation for Organizing Large Text Databases into Hierarchical Topic Taxonomies. VLDB J. 7(3): 163-178(1998)
  2. Soumen Chakrabarti, Byron Dom, Rakesh Agrawal, Prabhakar Raghavan: Using Taxonomy, Discriminants, and Signatures for Navigating in Text Databases. VLDB 1997: 446-455
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:38:42 2009