Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications.
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan:
Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications.
SIGMOD Conference 1998: 94-105
@inproceedings{DBLP:conf/sigmod/AgrawalGGR98,
author = {Rakesh Agrawal and
Johannes Gehrke and
Dimitrios Gunopulos and
Prabhakar Raghavan},
editor = {Laura M. Haas and
Ashutosh Tiwary},
title = {Automatic Subspace Clustering of High Dimensional Data for Data
Mining Applications},
booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
publisher = {ACM Press},
year = {1998},
isbn = {0-89791-995-5},
pages = {94-105},
ee = {http://doi.acm.org/10.1145/276304.276314, db/conf/sigmod/AgrawalGGR98.html},
crossref = {DBLP:conf/sigmod/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Data mining applications place special requirements on clustering
algorithms including:
the ability to find clusters embedded in subspaces of high dimensional
data, scalability, end-user comprehensibility of the results,
non-presumption of any canonical data distribution, and insensitivity
to the order of input records.
We present CLIQUE, a clustering algorithm that satisfies each of these
requirements.
CLIQUE identifies dense clusters in subspaces of maximum dimensionality.
It generates cluster descriptions in the form of DNF expressions that
are minimized for ease of comprehension. It produces identical results
irrespective of the order in which input records are presented and does
not presume any specific mathematical form for data distribution.
Through experiments, we show that CLIQUE efficiently finds accurate
clusters in large high dimensional datasets.
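The idea of finding dense regions inside subspaces can be illustrated with a small sketch. It is not the paper's implementation. It assumes, as CLIQUE is commonly described, an equal-width grid with `xi` intervals per dimension and a density threshold `tau` (a cell is dense when it holds more than a fraction `tau` of the points), and it searches subspaces bottom-up. The function name `dense_units`, the parameter values, and the sample data are hypothetical; coordinates are assumed to lie in [0, 1). Joining adjacent dense cells into clusters and minimizing the DNF descriptions are not shown.

```python
from collections import Counter
from itertools import combinations


def dense_units(points, xi=4, tau=0.3):
    """Map each subspace (tuple of dimension indices) to its dense grid cells.

    Assumes every coordinate lies in [0, 1). A cell is dense when it holds
    more than a fraction `tau` of all points.
    """
    n = len(points)
    d = len(points[0])
    # Grid cell of each point: one interval index per dimension.
    cells = [tuple(min(int(x * xi), xi - 1) for x in p) for p in points]

    # Level 1: dense intervals in each single dimension.
    level = {}
    for dim in range(d):
        counts = Counter((c[dim],) for c in cells)
        dense = {u for u, k in counts.items() if k / n > tau}
        if dense:
            level[(dim,)] = dense

    result = {}
    k = 1
    while level:
        result.update(level)
        next_level = {}
        dims = sorted({dim for sub in level for dim in sub})
        for sub in combinations(dims, k + 1):
            # A dense cell projects onto dense cells, so a (k+1)-dimensional
            # subspace can be skipped unless all its k-dimensional
            # sub-subspaces already contain dense cells.
            if any(s not in level for s in combinations(sub, k)):
                continue
            counts = Counter(tuple(c[i] for i in sub) for c in cells)
            dense = {u for u, cnt in counts.items() if cnt / n > tau}
            if dense:
                next_level[sub] = dense
        level = next_level
        k += 1
    return result


# Hypothetical data: four points share a cell in dimensions 0 and 1,
# while dimension 2 is spread evenly and carries no dense interval.
points = [
    (0.1, 0.6, 0.1), (0.1, 0.6, 0.3), (0.2, 0.6, 0.6), (0.2, 0.7, 0.9),
    (0.3, 0.1, 0.2), (0.6, 0.3, 0.4), (0.8, 0.9, 0.7), (0.9, 0.2, 0.8),
]
found = dense_units(points, xi=4, tau=0.3)
print(found[(0, 1)])  # {(0, 2)}
```

Because the sketch only counts points per cell, reordering `points` does not change the result, which mirrors the order-insensitivity requirement stated in the abstract.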
Copyright © 1998 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Printed Edition
Laura M. Haas, Ashutosh Tiwary (Eds.):
SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA.
ACM Press 1998, ISBN 0-89791-995-5,
SIGMOD Record 27(2), June 1998
References
- [1]
- ...
- [2]
- Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman:
The Design and Analysis of Computer Algorithms.
Addison-Wesley 1974, ISBN 0-201-00029-6
- [3]
- ...
- [4]
- ...
- [5]
- Roberto J. Bayardo Jr.:
Efficiently Mining Long Patterns from Databases.
SIGMOD Conference 1998: 85-93
- [6]
- Stefan Berchtold, Christian Böhm, Daniel A. Keim, Hans-Peter Kriegel:
A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space.
PODS 1997: 78-86
- [7]
- ...
- [8]
- Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur:
Dynamic Itemset Counting and Implication Rules for Market Basket Data.
SIGMOD Conference 1997: 255-264
- [9]
- ...
- [10]
- ...
- [11]
- ...
- [12]
- ...
- [13]
- Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu:
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.
KDD 1996: 226-231
- [14]
- Martin Ester, Hans-Peter Kriegel, Xiaowei Xu:
A Database Interface for Clustering in Large Spatial Databases.
KDD 1995: 94-99
- [15]
- Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy (Eds.):
Advances in Knowledge Discovery and Data Mining.
AAAI/MIT Press 1996, ISBN 0-262-56097-6
- [16]
- ...
- [17]
- ...
- [18]
- ...
- [19]
- ...
- [20]
- Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Hannu Toivonen:
Data mining, Hypergraph Transversals, and Machine Learning.
PODS 1997: 209-216
- [21]
- Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, Ramakrishnan Srikant:
Range Queries in OLAP Data Cubes.
SIGMOD Conference 1997: 73-88
- [22]
- ...
- [23]
- ...
- [24]
- ...
- [25]
- ...
- [26]
- Dao-I Lin, Zvi M. Kedem:
Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set.
EDBT 1998: 105-119
- [27]
- ...
- [28]
- Carsten Lund, Mihalis Yannakakis:
On the hardness of approximating minimization problems.
STOC 1993: 286-293
- [29]
- ...
- [30]
- Manish Mehta, Rakesh Agrawal, Jorma Rissanen:
SLIQ: A Fast Scalable Classifier for Data Mining.
EDBT 1996: 18-32
- [31]
- ...
- [32]
- Renée J. Miller, Yuping Yang:
Association Rules over Interval Data.
SIGMOD Conference 1997: 452-461
- [33]
- Raymond T. Ng, Jiawei Han:
Efficient and Effective Clustering Methods for Spatial Data Mining.
VLDB 1994: 144-155
- [34]
- ...
- [35]
- ...
- [36]
- ...
- [37]
- John C. Shafer, Rakesh Agrawal, Manish Mehta:
SPRINT: A Scalable Parallel Classifier for Data Mining.
VLDB 1996: 544-555
- [38]
- ...
- [39]
- ...
- [40]
- Ramakrishnan Srikant, Rakesh Agrawal:
Mining Quantitative Association Rules in Large Relational Tables.
SIGMOD Conference 1996: 1-12
- [41]
- Hannu Toivonen:
Sampling Large Databases for Association Rules.
VLDB 1996: 134-145
- [42]
- ...
- [43]
- ...
- [44]
- ...
- [45]
- Tian Zhang, Raghu Ramakrishnan, Miron Livny:
BIRCH: An Efficient Data Clustering Method for Very Large Databases.
SIGMOD Conference 1996: 103-114
Referenced by
- Anthony K. H. Tung, Raymond T. Ng, Laks V. S. Lakshmanan, Jiawei Han:
Constraint-based clustering in large databases.
ICDT 2001: 405-419
- Gholamhosein Sheikholeslami, Surojit Chatterjee, Aidong Zhang:
WaveCluster: A Wavelet Based Clustering Approach for Spatial Data in Very Large Databases.
VLDB J. 8(3-4): 289-304(2000)
- Theodore Johnson, Laks V. S. Lakshmanan, Raymond T. Ng:
The 3W Model and Algebra for Unified Data Mining.
VLDB 2000: 21-32
- Kaushik Chakrabarti, Sharad Mehrotra:
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces.
VLDB 2000: 89-100
- Carlos Ordonez, Paul Cereghini:
SQLEM: Fast Clustering in SQL using the EM Algorithm.
SIGMOD Conference 2000: 559-570
- Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander:
LOF: Identifying Density-Based Local Outliers.
SIGMOD Conference 2000: 93-104
- Charu C. Aggarwal, Philip S. Yu:
Finding Generalized Projected Clusters In High Dimensional Spaces.
SIGMOD Conference 2000: 70-81
- Edwin M. Knorr, Raymond T. Ng:
Finding Intensional Knowledge of Distance-Based Outliers.
VLDB 1999: 211-222
- Alexander Hinneburg, Daniel A. Keim:
Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering.
VLDB 1999: 506-517
- H. V. Jagadish, J. Madar, Raymond T. Ng:
Semantic Compression and Pattern Extraction with Fascicles.
VLDB 1999: 186-198
- Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander:
OPTICS: Ordering Points To Identify the Clustering Structure.
SIGMOD Conference 1999: 49-60
- Charu C. Aggarwal, Cecilia Magdalena Procopiuc, Joel L. Wolf, Philip S. Yu, Jong Soo Park:
Fast Algorithms for Projected Clustering.
SIGMOD Conference 1999: 61-72
- Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan:
A Framework for Measuring Changes in Data Characteristics.
PODS 1999: 126-137
- Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison L. Powell, James C. French:
Clustering Large Datasets in Arbitrary Metric Spaces.
ICDE 1999: 502-511
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:42 2009