
Clustering Categorical Data: An Approach Based on Dynamical Systems.

David Gibson, Jon M. Kleinberg, Prabhakar Raghavan: Clustering Categorical Data: An Approach Based on Dynamical Systems. VLDB J. 8(3-4): 222-236(2000)
@article{DBLP:journals/vldb/GibsonKR00,
  author    = {David Gibson and
               Jon M. Kleinberg and
               Prabhakar Raghavan},
  title     = {Clustering Categorical Data: An Approach Based on Dynamical Systems},
  journal   = {VLDB J.},
  volume    = {8},
  number    = {3-4},
  year      = {2000},
  pages     = {222-236},
  ee        = {db/journals/vldb/GibsonKR00.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By "categorical data," we mean tables with fields that cannot be naturally ordered by a metric - e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the co-occurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems.
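The iterative weight propagation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the update rule, so everything here is an assumption made for illustration: the function name propagate_weights, the choice of summation to combine the weights of co-occurring values, the Euclidean normalization after each pass, and the fixed iteration count are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def propagate_weights(rows, iterations=20):
    """Iteratively propagate weights among categorical values that co-occur.

    rows: list of equal-length tuples of categorical values.
    Each value is identified by (column index, value), so that equal
    strings in different columns stay distinct.
    Assumed update rule (not from the source): a value's new weight is the
    sum, over every row containing it, of the weights of the other values
    in that row; the weight vector is then scaled to unit Euclidean length.
    """
    nodes = sorted({(col, val) for row in rows for col, val in enumerate(row)})
    weights = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        updated = defaultdict(float)
        for row in rows:
            keyed = list(enumerate(row))
            row_total = sum(weights[k] for k in keyed)
            for k in keyed:
                # Contribution of the other values in the same row.
                updated[k] += row_total - weights[k]
        # Normalize; fall back to 1.0 if every weight is zero.
        norm = sqrt(sum(w * w for w in updated.values())) or 1.0
        weights = {node: updated[node] / norm for node in nodes}

    return weights


if __name__ == "__main__":
    # Hypothetical table: (producer, month of sale).
    table = [("Honda", "Aug"), ("Honda", "Sep"), ("Toyota", "Aug")]
    for node, weight in propagate_weights(table).items():
        print(node, round(weight, 4))
```

Under these assumptions, values that co-occur with the same partners receive the same weight, which is the kind of co-occurrence-based similarity the abstract refers to.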

Key Words

Clustering - Data mining - Categorical data - Dynamical systems - Hypergraphs

Copyright © 2000 by Springer, Berlin, Heidelberg. Permission to make digital or hard copies of the abstract is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice along with the full citation.



References

[1]
Rakesh Agrawal, Heikki Mannila, Ramakrishnan Srikant, Hannu Toivonen, A. Inkeri Verkamo: Fast Discovery of Association Rules. Advances in Knowledge Discovery and Data Mining 1996: 307-328
[2]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216
[3]
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
[4]
...
[5]
...
[6]
Noga Alon, Joel Spencer: The Probabilistic Method. John Wiley 1992, ISBN 0-471-53588-5
[7]
...
[8]
Avrim Blum, Joel Spencer: Coloring Random and Semi-Random k-Colorable Graphs. J. Algorithms 19(2): 204-234(1995)
[9]
Ravi B. Boppana: Eigenvalues and Graph Bisection: An Average-Case Analysis (Extended Abstract). FOCS 1987: 280-285
[10]
Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur: Dynamic Itemset Counting and Implication Rules for Market Basket Data. SIGMOD Conference 1997: 255-264
[11]
Sergey Brin, Rajeev Motwani, Craig Silverstein: Beyond Market Baskets: Generalizing Association Rules to Correlations. SIGMOD Conference 1997: 265-276
[12]
Sergey Brin, Lawrence Page: The Anatomy of a Large-Scale Hypertextual Web Search Engine. Computer Networks 30(1-7): 107-117(1998)
[13]
...
[14]
Tzi-cker Chiueh: Content-Based Image Indexing. VLDB 1994: 582-593
[15]
...
[16]
Gautam Das, Heikki Mannila, Pirjo Ronkainen: Similarity of Attributes by External Probes. KDD 1998: 23-29
[17]
Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, Richard A. Harshman: Indexing by Latent Semantic Analysis. JASIS 41(6): 391-407(1990)
[18]
...
[19]
...
[20]
...
[21]
...
[22]
...
[23]
Myron Flickner, Harpreet S. Sawhney, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, Peter Yanker: Query by Image and Video Content: The QBIC System. IEEE Computer 28(9): 23-32(1995)
[24]
M. R. Garey, David S. Johnson: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman 1979, ISBN 0-7167-1044-7
[25]
...
[26]
Eui-Hong Han, George Karypis, Vipin Kumar, Bamshad Mobasher: Clustering Based On Association Rule Hypergraphs. DMKD 1997: 0-
[27]
...
[28]
...
[29]
Zhexue Huang: A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining. DMKD 1997: 0-
[30]
...
[31]
...
[32]
Jon M. Kleinberg: Authoritative Sources in a Hyperlinked Environment. J. ACM 46(5): 604-632(1999)
[33]
...
[34]
...
[35]
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo: Discovering Frequent Episodes in Sequences. KDD 1995: 210-215
[36]
...
[37]
...
[38]
...
[39]
Daniel A. Spielman, Shang-Hua Teng: Spectral Partitioning Works: Planar Graphs and Finite Element Meshes. FOCS 1996: 96-105
[40]
Ramakrishnan Srikant, Rakesh Agrawal: Mining Generalized Association Rules. VLDB 1995: 407-419
[41]
Hannu Toivonen: Sampling Large Databases for Association Rules. VLDB 1996: 134-145
[42]
...
[43]
Tian Zhang, Raghu Ramakrishnan, Miron Livny: BIRCH: An Efficient Data Clustering Method for Very Large Databases. SIGMOD Conference 1996: 103-114
VLDB Journal: 1992-1995 Copyright © by VLDB Endowment / 1996-... Copyright © by Springer Verlag,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sun May 17 00:31:37 2009