Concepts and Effectiveness of the Cover-Coefficient-Based Clustering Methodology for Text Databases.
Fazli Can, Esen A. Ozkarahan:
Concepts and Effectiveness of the Cover-Coefficient-Based Clustering Methodology for Text Databases.
ACM Trans. Database Syst. 15(4): 483-517 (1990)
@article{DBLP:journals/tods/CanO90,
author = {Fazli Can and
Esen A. Ozkarahan},
title = {Concepts and Effectiveness of the Cover-Coefficient-Based Clustering
Methodology for Text Databases},
journal = {ACM Trans. Database Syst.},
volume = {15},
number = {4},
year = {1990},
pages = {483-517},
ee = {http://doi.acm.org/10.1145/99935.99938, db/journals/tods/CanO90.html},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
A new algorithm for document clustering is introduced. The base
concept of the algorithm, the cover coefficient (CC) concept,
provides a means of estimating the number of clusters within a
document database and relates indexing and clustering analytically.
The CC concept is used also to identify the cluster seeds and to
form clusters with these seeds. It is shown that the complexity of
the clustering process is very low. The retrieval experiments show
that the information-retrieval effectiveness of the algorithm is
compatible with a very demanding complete linkage clustering method
that is known to have good retrieval performance. The experiments
also show that the algorithm is 15.1 to 63.5 (with an average of
47.5) percent better than four other clustering algorithms in
cluster-based information retrieval. The experiments have validated
the indexing-clustering relationships and the complexity of the
algorithm and have shown improvements in retrieval effectiveness.
In the experiments, two document databases are used: TODS214 and
INSPEC. The latter is a common database with 12,684 documents.
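The abstract says the CC concept yields an estimate of the number of clusters but does not give the formulas. The sketch below illustrates one common statement of that idea, recalled from the cover-coefficient literature rather than taken from this record, so treat the definitions as assumptions: for a document-term matrix D, the cover coefficient is c_ij = (1 / row_sum_i) * sum_k (d_ik * d_jk / col_sum_k), and the estimated number of clusters is the sum of the diagonal entries c_ii. The function names and the toy matrix are hypothetical, and the code assumes every document and every term has at least one nonzero entry.

```python
def cover_coefficient_matrix(D):
    """Return the m x m matrix C for an m x n document-term matrix D.

    Assumed definition (not stated in this record):
    c_ij = (1 / row_sum_i) * sum_k(d_ik * d_jk / col_sum_k).
    Requires every row and every column of D to have a nonzero sum.
    """
    m, n = len(D), len(D[0])
    row_sums = [sum(row) for row in D]
    col_sums = [sum(D[i][k] for i in range(m)) for k in range(n)]
    C = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            shared = sum(D[i][k] * D[j][k] / col_sums[k] for k in range(n))
            C[i][j] = shared / row_sums[i]
    return C


def estimated_cluster_count(D):
    """Sum of the diagonal of C, read as the estimated number of clusters."""
    C = cover_coefficient_matrix(D)
    return sum(C[i][i] for i in range(len(C)))


# Hypothetical toy data: documents 0 and 1 share the same two terms,
# document 2 uses two terms that no other document uses.
toy = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
count = estimated_cluster_count(toy)
```

For the toy matrix the two identical documents each contribute 0.5 and the isolated document contributes 1, so the estimate matches the two groups visible by inspection. Each row of C sums to 1 under this definition, which is a quick sanity check on any implementation.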
Copyright © 1990 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
References
- [1]
- ...
- [2]
- ...
- [3]
- Fazli Can, Esen A. Ozkarahan:
A Clustering Scheme.
SIGIR 1983: 115-121
- [4]
- ...
- [5]
- Fazli Can, Esen A. Ozkarahan:
Concepts of the Cover-Coefficient-Based Clustering Methodology.
SIGIR 1985: 204-211
- [6]
- ...
- [7]
- ...
- [8]
- ...
- [9]
- ...
- [10]
- Abdelmoula El-Hamdouchi, Peter Willett:
Comparison of Hierarchic Agglomerative Clustering Methods for Document Retrieval.
Comput. J. 32(3): 220-227 (1989)
- [11]
- ...
- [12]
- ...
- [13]
- ...
- [14]
- ...
- [15]
- Anil K. Jain, Richard C. Dubes:
Algorithms for Clustering Data.
Prentice-Hall 1988
- [16]
- ...
- [17]
- ...
- [18]
- ...
- [19]
- Esen A. Ozkarahan, Fazli Can:
An Automatic and Tunable Document Indexing System.
SIGIR 1986: 234-243
- [20]
- Edie M. Rasmussen, Peter Willett:
Non-Hierarchic Document Clustering Using the ICL Distributed Array Processor.
SIGIR 1987: 132-139
- [21]
- ...
- [22]
- ...
- [23]
- Gerard Salton:
Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer.
Addison-Wesley 1989, ISBN 0-201-12227-8
- [24]
- Gerard Salton, Chris Buckley:
Term-Weighting Approaches in Automatic Text Retrieval.
Inf. Process. Manage. 24(5): 513-523 (1988)
- [25]
- Gerard Salton, Michael McGill:
Introduction to Modern Information Retrieval.
McGraw-Hill Book Company 1984, ISBN 0-07-054484-0
- [26]
- Gerard Salton, A. Wong:
Generation and Search of Clustered Files.
ACM Trans. Database Syst. 3(4): 321-346 (1978)
- [27]
- C. J. van Rijsbergen:
Information Retrieval.
Butterworth 1979, ISBN 0-408-70929-4
- [28]
- Ellen M. Voorhees:
The Cluster Hypothesis Revisited.
SIGIR 1985: 188-196
- [29]
- ...
- [30]
- ...
- [31]
- Ellen M. Voorhees:
The Efficiency of Inverted Index and Cluster Searches.
SIGIR 1986: 164-174
- [32]
- ...
- [33]
- S. Bing Yao:
Approximating the Number of Accesses in Database Organizations.
Commun. ACM 20(4): 260-261 (1977)
TODS, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Tue Jun 24 18:39:09 2008