ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Adapting a Spatial Access Structure for Document Representations in Vector Space.

Andreas Henrich: Adapting a Spatial Access Structure for Document Representations in Vector Space. CIKM 1996: 19-26
@inproceedings{DBLP:conf/cikm/Henrich96,
  author    = {Andreas Henrich},
  title     = {Adapting a Spatial Access Structure for Document Representations
               in Vector Space},
  booktitle = {CIKM '96, Proceedings of the Fifth International Conference on
               Information and Knowledge Management, November 12 - 16, 1996,
               Rockville, Maryland, USA},
  publisher = {ACM},
  year      = {1996},
  pages     = {19-26},
  ee        = {db/conf/cikm/Henrich96.html, http://doi.acm.org/10.1145/238355.238367},
  crossref  = {DBLP:conf/cikm/96},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

In the field of information retrieval, the vector space model has been proposed. In this model, queries and documents are represented as term vectors, where each coefficient represents the relevance of a given term with respect to the document or query. A typical task in this context is to search for the documents most similar to a given query vector.
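
As an illustration of the search task described above, the following minimal sketch ranks documents against a query vector. It assumes cosine similarity as the similarity measure and sparse dictionaries as the vector representation; neither choice is stated in the abstract, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors (dict: term -> weight).

    Assumption: cosine similarity is used here for illustration only.
    """
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query, documents, k=1):
    """Return the ids of the k documents most similar to the query.

    This is a linear scan over all documents, i.e. the baseline that an
    access structure is meant to avoid.
    """
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in documents.items()]
    # Highest similarity first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]
```

A linear scan like this touches every document vector, which is the cost that an index over the vectors would try to reduce.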

On the other hand, algorithms to perform nearest neighbor and distance scan queries have been proposed for various types of spatial access structures. Unfortunately, these access structures assume implicitly that the number of dimensions is relatively small - which is not the case for document representation vectors.

In this paper we discuss the adaptation of spatial access structures for document representation vectors. We describe how some peculiarities of document representation vectors can be exploited to overcome the problems with higher dimensions to a certain extent. We exploit these peculiarities by introducing a new cluster split technique and a sophisticated algorithm to calculate an upper bound for the similarity of the documents located in a subtree of the access structure.
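
To make the idea of a per-subtree similarity bound concrete, here is a deliberately simple sketch. It is not the algorithm from the paper, which the abstract does not specify. It assumes non-negative term weights, the dot product as the similarity measure, and a subtree summarized by the maximum weight observed for each term; all names are hypothetical.

```python
def dot(a, b):
    """Dot product of two sparse term vectors (dict: term -> weight)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def per_term_maxima(vectors):
    """Summarize a group of document vectors by the largest weight seen per term."""
    maxima = {}
    for vec in vectors:
        for term, weight in vec.items():
            maxima[term] = max(maxima.get(term, 0.0), weight)
    return maxima


def subtree_upper_bound(query, maxima):
    """Upper bound on dot(query, d) for every document d summarized by maxima.

    Valid only under the stated assumption that all weights are non-negative:
    each term then contributes at most query_weight * max_weight.
    """
    return sum(w * maxima.get(t, 0.0) for t, w in query.items())
```

In a branch-and-bound search, a subtree whose bound falls below the similarity of the best candidate found so far could be skipped without inspecting its documents.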

Copyright © 1996 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 4, CIKM, DOLAP, GIS, SIGFIDET, ..." and ...
DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

CIKM '96, Proceedings of the Fifth International Conference on Information and Knowledge Management, November 12 - 16, 1996, Rockville, Maryland, USA. ACM 1996


Referenced by

  1. Andreas Henrich: The LSDh-Tree: An Access Structure for Feature Vectors. ICDE 1998: 362-369
  2. Jochen Van den Bercken, Bernhard Seeger, Peter Widmayer: A Generic Approach to Bulk Loading Multidimensional Index Structures. VLDB 1997: 406-415
CIKM 1996 Proceedings, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:01:52 2009