@inproceedings{DBLP:conf/sigir/Sanderson94,
  author    = {Mark Sanderson},
  editor    = {W. Bruce Croft and C. J. van Rijsbergen},
  title     = {Word Sense Disambiguation and Information Retrieval},
  booktitle = {Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Dublin, Ireland, 3-6 July 1994 (Special Issue of the SIGIR Forum)},
  publisher = {ACM/Springer},
  year      = {1994},
  isbn      = {3-540-19889-X},
  pages     = {142-151},
  ee        = {db/conf/sigir/Sanderson94.html},
  crossref  = {DBLP:conf/sigir/94},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval (IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will increase. However, recent research into the application of a word sense disambiguator to an IR system failed to show any performance increase. From these results it has become clear that more basic research is needed to investigate the relationship between sense ambiguity, disambiguation, and IR.
Using a technique that introduces additional sense ambiguity into a collection, this paper presents research that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system when it is retrieving from very short queries. In addition, we argue that if a word sense disambiguator is to be of any use to an IR system, the disambiguator must be able to resolve word senses to a high degree of accuracy.
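The abstract does not spell out how additional ambiguity is introduced or how disambiguation accuracy is varied. The Python sketch below shows one plausible reading, under stated assumptions: groups of distinct words are conflated into a single artificial token, and an imperfect disambiguator is simulated by restoring the original word only with a chosen probability. The function names, the grouping scheme, and the `accuracy` parameter are all hypothetical and are not taken from the paper.

```python
import random


def build_ambiguity_map(vocabulary, group_size, seed=0):
    """Randomly partition the vocabulary into groups of `group_size` words.

    Returns (mapping, groups): `mapping` sends each word to an artificial
    ambiguous token, and `groups` sends each token back to its member words.
    Assumption: random grouping is only one possible conflation scheme.
    """
    rng = random.Random(seed)
    words = sorted(vocabulary)
    rng.shuffle(words)
    mapping, groups = {}, {}
    for start in range(0, len(words), group_size):
        members = words[start:start + group_size]
        token = "AMBIG_%d" % (start // group_size)
        groups[token] = members
        for word in members:
            mapping[word] = token
    return mapping, groups


def add_ambiguity(tokens, mapping):
    """Replace every known word by its ambiguous token; keep unknown words."""
    return [mapping.get(tok, tok) for tok in tokens]


def simulate_disambiguator(original_tokens, mapping, groups, accuracy, seed=0):
    """Simulate a disambiguator that is correct with probability `accuracy`.

    When it errs, it picks a different member of the same group at random.
    """
    rng = random.Random(seed)
    resolved = []
    for word in original_tokens:
        members = groups.get(mapping.get(word), [word])
        wrong = [m for m in members if m != word]
        if not wrong or rng.random() < accuracy:
            resolved.append(word)
        else:
            resolved.append(rng.choice(wrong))
    return resolved


if __name__ == "__main__":
    vocab = {"bank", "river", "spring", "jaguar"}
    mapping, groups = build_ambiguity_map(vocab, group_size=2)
    doc = ["bank", "river", "spring"]
    print(add_ambiguity(doc, mapping))
    print(simulate_disambiguator(doc, mapping, groups, accuracy=0.75))
```

Indexing a collection once with `add_ambiguity` output and once per `accuracy` level with `simulate_disambiguator` output would, under these assumptions, let retrieval effectiveness be compared across ambiguity and accuracy settings.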
Copyright © 1994 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.