Background for Association Rules and Cost Estimate of Selected Mining Algorithms.
Jia Liang Han, Ashley W. Plank:
Background for Association Rules and Cost Estimate of Selected Mining Algorithms.
CIKM 1996: 73-80@inproceedings{DBLP:conf/cikm/HanP96,
author = {Jia Liang Han and
Ashley W. Plank},
title = {Background for Association Rules and Cost Estimate of Selected
Mining Algorithms},
booktitle = {CIKM '96, Proceedings of the Fifth International Conference on
Information and Knowledge Management, November 12 - 16, 1996,
Rockville, Maryland, USA},
publisher = {ACM},
year = {1996},
pages = {73-80},
ee = {db/conf/cikm/HanP96.html, http://doi.acm.org/10.1145/238355.238440},
crossref = {DBLP:conf/cikm/96},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX
Abstract
Association rules may be used to represent regular patterns in
databases for the purpose of decision support applications. Fast
algorithms for mining association rules have been proposed and studied
experimentally in the literature. A key to the algorithms is to find
large itemsets, i.e., sets of items that are well supported in the
database. In this paper, we study association rules from a
statistical viewpoint. Assuming statistical independence, we obtain
the expected support of an itemset, which we regard as the background.
Fluctuation of support of an itemset is noise unless it differs
significantly from the background. We introduce the concept of
clusters, which are the largest large itemsets. From clusters in a
database, we may estimate the number of large itemsets and candidate
itemsets (intermediate results), which are important to the space
complexity. We consider computation costs of Apriori, AprioriTid,
AprioriHybrid, OCD, SETM and DHP algorithms and study their
scalability. Our study suggests that the key to costs and scalability
is the space complexity of large itemsets and candidate itemsets. If
the size of candidate k-itemsets is less than main memory, then the
above algorithms scale up well. If the size of candidate k-itemsets
is larger than main memory, however, the costs increase very rapidly.
Copyright © 1996 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
CDROM Version: Load the CDROM "Volume 2 Issue 4, CIKM, DOLAP, GIS, SIGFIDET, ..." and ...
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
BibTeX
Printed Edition
CIKM '96, Proceedings of the Fifth International Conference on Information and Knowledge Management, November 12 - 16, 1996, Rockville, Maryland, USA.
ACM 1996
Contents BibTeX
Online Edition
Citation Page
BibTeX
BibTeX
ACM SIGMOD Anthology - DBLP:
[Home | Search: Author, Title | Conferences | Journals]
CIKM 1996 Proceedings, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:01:52 2009