Analysis of Multiterm Queries in a Dynamic Signature File Organization.
Deniz Aktug, Fazli Can:
Analysis of Multiterm Queries in a Dynamic Signature File Organization.
SIGIR 1993: 96-105
@inproceedings{DBLP:conf/sigir/AktugC93,
author = {Deniz Aktug and
Fazli Can},
editor = {Robert Korfhage and
Edie M. Rasmussen and
Peter Willett},
title = {Analysis of Multiterm Queries in a Dynamic Signature File Organization},
booktitle = {Proceedings of the 16th Annual International ACM-SIGIR Conference
on Research and Development in Information Retrieval. Pittsburgh,
PA, USA, June 27 - July 1, 1993},
publisher = {ACM},
year = {1993},
isbn = {0-89791-605-0},
pages = {96-105},
ee = {db/conf/sigir/AktugC93.html},
crossref = {DBLP:conf/sigir/93},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Our analysis combines the concerns of signature extraction and signature file organization which
have usually been treated as separate issues. We also relax the uniform frequency and single term
query assumptions and provide a comprehensive analysis for multiterm query environments where
terms can be classified based on their query and database occurrence frequencies. The performance
of three superimposed signature generation schemes is explored as they are applied to one dynamic
signature file organization based on linear hashing: Linear Hashing with Superimposed Signatures
(LHSS). The first scheme (SM) lets all terms set the same number of bits regardless of their
discriminatory power whereas the second and third methods (MMS and MMM) emphasize the terms
with high query and low database occurrence frequencies. Of these three schemes, only MMM takes
the probability distribution of the number of query terms into account in finding the optimal mapping
strategy. Derivation of performance evaluation formulas is provided together with the results of
various experimental settings. Suggestions as to how to implement the given techniques in real life
cases are also provided. Results indicate that MMM outperforms the other methods as the gap
between the discriminatory power of the terms gets larger. The absolute value of the savings
provided by MMM reaches a maximum for the high query weight case. However, the extra savings
decline sharply for high weight queries and moderately for low weight queries as the
database size increases.
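
Illustrative sketch (not taken from the paper): the abstract contrasts a scheme in which every term sets the same number of bits (SM) with schemes that give more bits to highly discriminatory terms (MMS, MMM). The Python below shows generic superimposed coding under assumed names: `term_bits`, `signature`, `may_match`, the SHA-256 based bit selection, and the example weight functions are all hypothetical, and the optimal bit allocations derived in the paper are not reproduced here.

```python
import hashlib


def term_bits(term, width, bits_per_term):
    """Pick `bits_per_term` distinct bit positions for a term.

    Hypothetical hash-based choice; requires bits_per_term <= width.
    """
    positions = []
    counter = 0
    while len(positions) < bits_per_term:
        digest = hashlib.sha256(f"{term}:{counter}".encode()).digest()
        pos = int.from_bytes(digest[:4], "big") % width
        if pos not in positions:
            positions.append(pos)
        counter += 1
    return positions


def signature(terms, width, weight):
    """Superimpose (bitwise OR) the bit patterns of all terms.

    `weight(term)` returns how many bits that term sets.
    """
    sig = 0
    for term in terms:
        for pos in term_bits(term, width, weight(term)):
            sig |= 1 << pos
    return sig


def may_match(record_sig, query_sig):
    """True when every query bit is set in the record signature.

    False drops are possible; missed matches are not.
    """
    return record_sig & query_sig == query_sig


# SM-like assumption: every term sets the same number of bits.
def uniform(term):
    return 3


# Assumed two-class weighting: terms in a made-up "discriminatory"
# set receive more bits than the rest.
DISCRIMINATORY = {"hashing"}


def weighted(term):
    return 5 if term in DISCRIMINATORY else 2


record_terms = ["hashing", "signature", "query"]
record_sm = signature(record_terms, 64, uniform)
record_mm = signature(record_terms, 64, weighted)
```

A multiterm query is encoded with the same weight function as the records and tested with `may_match`; a record that contains all query terms always passes, while a record that passes may still be a false drop and needs checking against the actual text.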
Copyright © 1993 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Printed Edition
Robert Korfhage, Edie M. Rasmussen, Peter Willett (Eds.):
Proceedings of the 16th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Pittsburgh, PA, USA, June 27 - July 1, 1993.
ACM 1993, ISBN 0-89791-605-0