Fast Algorithms for Projected Clustering

Charu C. Aggarwal*       Cecilia Procopiuc       Joel L. Wolf
IBM T J Watson Research Center       Duke University       IBM T J Watson Research Center
charu@watson.ibm.com            

Philip S. Yu       Jong Soo Park
IBM T J Watson Research Center       IBM T J Watson Research Center
     

Abstract

The clustering problem is well known in the database literature for its numerous applications in problems such as customer segmentation, classification and trend analysis. The projected clustering problem is a generalization of the clustering problem in which the subset of dimensions selected is specific to each cluster. We develop fast algorithms for solving the projected clustering problem, and test their performance against other recently proposed methods in the literature.