Digital Symposium Collection 2000  

Dynamic Load Balancing for Parallel Association Rule Mining on Heterogenous PC Cluster Systems

Masahisa Tamura and Masaru Kitsuregawa

Abstract
We propose dynamic load balancing strategies for parallel association rule mining in a heterogeneous PC cluster environment. PC clusters have recently come to be regarded as one of the most promising platforms for heavy data-intensive applications such as decision support query processing and data mining. The development cycle of PC hardware is becoming extremely short, which results in heterogeneous systems in which the CPU clock speed, the performance and capacity of the disk drives, etc. differ among the component PCs. Heterogeneity is inevitable. Current algorithms generally assume homogeneity, so when they are applied naively to a heterogeneous system, performance falls far below expectation. New methodologies are needed to handle heterogeneity. In this paper, we propose new dynamic load balancing methods for association rule mining that work on heterogeneous systems. We propose two strategies, called candidate migration and transaction migration. Candidate migration is invoked first. When it cannot resolve the load imbalance, transaction migration is employed; it is more costly but more effective against strong imbalance. We have implemented both strategies on a PC cluster with two different types of PCs, one with Pentium Pro and the other with Pentium II. The experimental results confirm that the proposed approach balances the workload among heterogeneous PCs very effectively.
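The two-stage idea in the abstract (try the cheaper migration first, fall back to the costlier one only if imbalance remains) can be illustrated with a toy sketch. This is not the authors' implementation: the cost model, the `Node` fields, the `speed` values, the imbalance measure, and the 0.1 threshold are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    speed: float       # hypothetical relative processing rate
    candidates: int    # candidate itemsets assigned to this node
    transactions: int  # transactions stored on this node

    def est_time(self) -> float:
        # Toy cost model (an assumption): work is the sum of both counts.
        return (self.candidates + self.transactions) / self.speed


def imbalance(nodes):
    """Relative gap between the slowest and fastest estimated finish times."""
    times = [n.est_time() for n in nodes]
    return 0.0 if max(times) == 0 else (max(times) - min(times)) / max(times)


def shift(nodes, attr, threshold, step=1):
    """Move `step` units of `attr` from the slowest to the fastest node until
    the imbalance is within `threshold` or no further move is possible."""
    moved = 0
    while imbalance(nodes) > threshold:
        slow = max(nodes, key=Node.est_time)
        fast = min(nodes, key=Node.est_time)
        if getattr(slow, attr) < step:
            break  # the overloaded node has nothing left of this kind
        setattr(slow, attr, getattr(slow, attr) - step)
        setattr(fast, attr, getattr(fast, attr) + step)
        if fast.est_time() > slow.est_time():
            # The move overshot; undo it and stop.
            setattr(slow, attr, getattr(slow, attr) + step)
            setattr(fast, attr, getattr(fast, attr) - step)
            break
        moved += step
    return moved


def balance(nodes, threshold=0.1):
    """Stage 1: migrate candidates. Stage 2: migrate transactions if needed."""
    moved_candidates = shift(nodes, "candidates", threshold)
    moved_transactions = 0
    if imbalance(nodes) > threshold:
        moved_transactions = shift(nodes, "transactions", threshold)
    return moved_candidates, moved_transactions


# Two made-up nodes; the 1.0 and 2.0 speeds are illustrative, not measured.
nodes = [Node("slow", 1.0, 10, 100), Node("fast", 2.0, 10, 100)]
print(balance(nodes), round(imbalance(nodes), 3))
```

In this made-up configuration, moving candidates alone cannot bring the gap under the threshold, so the sketch falls through to the second stage, mirroring the order of the two strategies described above.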


BIBTEX

@inproceedings{DBLP:conf/vldb/TamuraK99,
  author    = {Masahisa Tamura and
               Masaru Kitsuregawa},
  editor    = {Malcolm P. Atkinson and
               Maria E. Orlowska and
               Patrick Valduriez and
               Stanley B. Zdonik and
               Michael L. Brodie},
  title     = {Dynamic Load Balancing for Parallel Association Rule Mining on
               Heterogenous PC Cluster Systems},
  booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
               Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
               UK},
  publisher = {Morgan Kaufmann},
  year      = {1999},
  isbn      = {1-55860-615-5},
  pages     = {162-173},
  crossref  = {DBLP:conf/vldb/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Copyright(C) 2000 ACM