ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Parallel Mining Algorithms for Generalized Association Rules with Classification Hierarchy.

Takahiko Shintani, Masaru Kitsuregawa: Parallel Mining Algorithms for Generalized Association Rules with Classification Hierarchy. SIGMOD Conference 1998: 25-36
@inproceedings{DBLP:conf/sigmod/ShintaniK98,
  author    = {Takahiko Shintani and
               Masaru Kitsuregawa},
  editor    = {Laura M. Haas and
               Ashutosh Tiwary},
  title     = {Parallel Mining Algorithms for Generalized Association Rules
               with Classification Hierarchy},
  booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
               on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-995-5},
  pages     = {25-36},
  ee        = {http://doi.acm.org/10.1145/276304.276308, db/conf/sigmod/ShintaniK98.html},
  crossref  = {DBLP:conf/sigmod/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

Association rule mining recently attracted strong attention. Usually, the classification hierarchy over the data items is available. Users are interested in generalized association rules that span different levels of the hierarchy, since sometimes more interesting rules can be derived by taking the hierarchy into account.

In this paper, we propose the new parallel algorithms for mining association rules with classification hierarchy on a shared-nothing parallel machine to improve its performance. Our algorithms partition the candidate itemsets over the processors, which exploits the aggregate memory of the system effectively. If the candidate itemsets are partitioned without considering classification hierarchy, both the items and its all the ancestor items have to be transmitted, that causes prohibitively large amount of communications. Our method minimizes interprocessor communication by considering the hierarchy. Moreover, in our algorithm, the available memory space is fully utilized by identifying the frequently occurring candidate itemsets and copying them over all the processors, through which frequent itemsets can be processed locally without any communication. Thus it can effectively reduce the load skew among the processors. Several experiments are done by changing the granule of copying itemsets, from the whole tree, to the small group of the frequent itemsets along the hierarchy. The coarser the grain, the easier the control but it is rather difficult to achieve the sufficient load balance. The finer the grain, the more complicated the control is required but it can balance the load quite well.

We implemented proposed algorithms on IBM SP-2. Performance evaluations show that our algorithms are effective for handling skew and attain sufficient speedup ratio.

Copyright © 1998 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD DiSC

CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ... Online Version (ACM WWW Account required): Full Text in PDF Format

ACM SIGMOD Anthology

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Laura M. Haas, Ashutosh Tiwary (Eds.): SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA. ACM Press 1998, ISBN 0-89791-995-5 BibTeX , SIGMOD Record 27(2), June 1998
Contents

Online Edition: ACM SIGMOD

[Abstract]
[Full Text (Postscript)]

References

[AS96]
Rakesh Agrawal, John C. Shafer: Parallel Mining of Association Rules. IEEE Trans. Knowl. Data Eng. 8(6): 962-969(1996) BibTeX
[CHN+96]
David Wai-Lok Cheung, Jiawei Han, Vincent T. Y. Ng, Ada Wai-Chee Fu, Yongjian Fu: A Fast Distributed Algorithm for Mining Association Rules. PDIS 1996: 31-42 BibTeX
[CNFF96]
David Wai-Lok Cheung, Vincent T. Y. Ng, Ada Wai-Chee Fu, Yongjian Fu: Efficient Mining of Association Rules in Distributed Databases. IEEE Trans. Knowl. Data Eng. 8(6): 911-922(1996) BibTeX
[HKK97]
Eui-Hong Han, George Karypis, Vipin Kumar: Scalable Parallel Data Mining for Association Rules. SIGMOD Conference 1997: 277-288 BibTeX
[PCY95]
Jong Soo Park, Ming-Syan Chen, Philip S. Yu: Efficient Parallel and Data Mining for Association Rules. CIKM 1995: 31-36 BibTeX
[RR94]
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499 BibTeX
[SA95]
Ramakrishnan Srikant, Rakesh Agrawal: Mining Generalized Association Rules. VLDB 1995: 407-419 BibTeX
[SA96]
Ramakrishnan Srikant, Rakesh Agrawal: Mining Sequential Patterns: Generalizations and Performance Improvements. EDBT 1996: 3-17 BibTeX
[SK96]
Takahiko Shintani, Masaru Kitsuregawa: Hash Based Parallel Algorithms for Mining Association Rules. PDIS 1996: 19-30 BibTeX
[SK98]
...

Referenced by

  1. Masahisa Tamura, Masaru Kitsuregawa: Dynamic Load Balancing for Parallel Association Rule Mining on Heterogenous PC Cluster Systems. VLDB 1999: 162-173
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:41 2009