On Modeling of Information Retrieval Concepts in Vector Space.

S. K. Michael Wong, Wojciech Ziarko, Vijay V. Raghavan, P. C. N. Wong: On Modeling of Information Retrieval Concepts in Vector Space. ACM Trans. Database Syst. 12(2): 299-321 (1987)
@article{DBLP:journals/tods/WongZRW87,
  author    = {S. K. Michael Wong and
               Wojciech Ziarko and
               Vijay V. Raghavan and
               P. C. N. Wong},
  title     = {On Modeling of Information Retrieval Concepts in Vector Space},
  journal   = {ACM Trans. Database Syst.},
  volume    = {12},
  number    = {2},
  year      = {1987},
  pages     = {299-321},
  ee        = {http://doi.acm.org/10.1145/22952.22957, db/journals/tods/WongZRW87.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The Vector Space Model (VSM) has been adopted in information retrieval as a means of coping with inexact representation of documents and queries, and the resulting difficulties in determining the relevance of a document relative to a given query. The major problem in employing this approach is that the explicit representation of term vectors is not known a priori. Consequently, earlier researchers made the assumption that the vectors corresponding to terms are pairwise orthogonal. Such an assumption is clearly unrealistic. Although attempts have been made to compensate for this assumption by some separate, corrective steps, such methods are ad hoc and, in most cases, formally inconsistent.

In this paper, a generalization of the VSM, called the GVSM, is advanced. The developments provide a solution not only for the computation of a measure of similarity (correlation) between terms, but also for the incorporation of these similarities into the retrieval process.
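The difference between assuming orthogonal term vectors and incorporating term similarities can be illustrated with a minimal sketch. This is not the construction used in the paper; it only assumes that a term-term correlation matrix is available from somewhere. The names `generalized_similarity`, `IDENTITY`, and `CORRELATED`, and the value 0.8, are hypothetical and chosen purely for illustration.

```python
import math


def generalized_similarity(doc, query, term_corr):
    """Cosine-style similarity d^T G q / sqrt((d^T G d) * (q^T G q)).

    doc, query: lists of term weights of equal length.
    term_corr:  square matrix G of assumed term-term correlations.
    With G equal to the identity this reduces to the ordinary cosine
    measure, i.e. the pairwise-orthogonal term assumption.
    """
    def bilinear(x, y):
        return sum(
            x[i] * term_corr[i][j] * y[j]
            for i in range(len(x))
            for j in range(len(y))
        )

    denom = math.sqrt(bilinear(doc, doc) * bilinear(query, query))
    return bilinear(doc, query) / denom if denom else 0.0


# Two index terms; the off-diagonal value 0.8 is an invented correlation.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
CORRELATED = [[1.0, 0.8], [0.8, 1.0]]

doc = [1.0, 0.0]    # document indexed only by term 0
query = [0.0, 1.0]  # query uses only term 1

sim_orthogonal = generalized_similarity(doc, query, IDENTITY)
sim_correlated = generalized_similarity(doc, query, CORRELATED)
```

Under the orthogonality assumption the document and query share no terms, so their similarity is zero; once a nonzero correlation between the two terms is supplied, the same pair receives a positive score. How such correlations are computed is the subject of the paper and is not reproduced here.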

The major strength of the GVSM derives from the fact that it is theoretically sound and elegant. Furthermore, experimental evaluation of the model on several test collections indicates that the performance is better than that of the VSM. Experiments have been performed on some variations of the GVSM, and all these results have also been compared to those of the VSM, based on inverse document frequency weighting. These results and some ideas for the efficient implementation of the GVSM are discussed.

Copyright © 1987 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.



Referenced by

  1. Lawrence V. Saxton, Vijay V. Raghavan: Design of an Integrated Information Retrieval/Database Management System. IEEE Trans. Knowl. Data Eng. 2(2): 210-219 (1990)
  2. Norbert Fuhr: A Probabilistic Framework for Vague Queries and Imprecise Information in Databases. VLDB 1990: 696-707