ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Exploratory Mining and Pruning Optimizations of Constrained Association Rules.

Raymond T. Ng, Laks V. S. Lakshmanan, Jiawei Han, Alex Pang: Exploratory Mining and Pruning Optimizations of Constrained Association Rules. SIGMOD Conference 1998: 13-24
@inproceedings{DBLP:conf/sigmod/NgLHP98,
  author    = {Raymond T. Ng and
               Laks V. S. Lakshmanan and
               Jiawei Han and
               Alex Pang},
  editor    = {Laura M. Haas and
               Ashutosh Tiwary},
  title     = {Exploratory Mining and Pruning Optimizations of Constrained Association
               Rules},
  booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
               on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-995-5},
  pages     = {13-24},
  ee        = {http://doi.acm.org/10.1145/276304.276307, db/conf/sigmod/NgLHP98.html},
  crossref  = {DBLP:conf/sigmod/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

From the standpoint of supporting human-centered discovery of knowledge, the present-day model of mining association rules suffers from the following serious shortcomings: (i) lack of user exploration and control, (ii) lack of focus, and (iii) rigid notion of relationships. In effect, this model functions as a black-box, admitting little user interaction in between. We propose, in this paper, an architecture that opens up the black-box, and supports constraint-based, human-centered exploratory mining of associations. The foundation of this architecture is a rich set of constraint constructs, including domain, class, and sql-style aggregate constraints, which enable users to clearly specify what associations are to be mined. We propose ``constrained association queries'' as a means of specifying the constraints to be satisfied by the antecedent and consequent of a mined association.
In this paper, we mainly focus on the technical challenges in guaranteeing a level of performance that is commensurate with the selectivities of the constraints in an association query. To this end, we introduce and analyze two properties of constraints that are critical to pruning: ``anti-monotonicity'' and ``succinctness''. We then develop characterizations of various constraints into four categories, according to these properties. Finally, we describe a mining algorithm called CAP, which achieves a maximized degree of pruning for all categories of constraints. Experimental results indicate that CAP can run much faster, in some cases as much as 80 times, than several basic algorithms. This demonstrates how important the succinctness and anti-monotonicity properties are, in delivering the performance guarantee.

Copyright © 1998 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD DiSC

CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ... Online Version (ACM WWW Account required): Full Text in PDF Format

ACM SIGMOD Anthology

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Laura M. Haas, Ashutosh Tiwary (Eds.): SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA. ACM Press 1998, ISBN 0-89791-995-5 BibTeX , SIGMOD Record 27(2), June 1998
Contents

Online Edition: ACM SIGMOD

[Abstract]
[Full Text (Postscript)]

References

[1]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216 BibTeX
[2]
Rakesh Agrawal, John C. Shafer: Parallel Mining of Association Rules. IEEE Trans. Knowl. Data Eng. 8(6): 962-969(1996) BibTeX
[3]
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499 BibTeX
[4]
Elena Baralis, Giuseppe Psaila: Designing Templates for Mining Association Rules. J. Intell. Inf. Syst. 9(1): 7-32(1997) BibTeX
[5]
Sergey Brin, Rajeev Motwani, Craig Silverstein: Beyond Market Baskets: Generalizing Association Rules to Correlations. SIGMOD Conference 1997: 265-276 BibTeX
[6]
David Wai-Lok Cheung, Jiawei Han, Vincent T. Y. Ng, C. Y. Wong: Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique. ICDE 1996: 106-114 BibTeX
[7]
Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Data Mining Using Two-Dimensional Optimized Accociation Rules: Scheme, Algorithms, and Visualization. SIGMOD Conference 1996: 13-23 BibTeX
[8]
Eui-Hong Han, George Karypis, Vipin Kumar: Scalable Parallel Data Mining for Association Rules. SIGMOD Conference 1997: 277-288 BibTeX
[9]
Jiawei Han, Yongjian Fu: Discovery of Multiple-Level Association Rules from Large Databases. VLDB 1995: 420-431 BibTeX
[10]
Joseph M. Hellerstein, Peter J. Haas, Helen J. Wang: Online Aggregation. SIGMOD Conference 1997: 171-182 BibTeX
[11]
Tomasz Imielinski, Heikki Mannila: A Database Perspective on Knowledge Discovery. Commun. ACM 39(11): 58-64(1996) BibTeX
[12]
...
[13]
Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu Toivonen, A. Inkeri Verkamo: Finding Interesting Rules from Large Sets of Discovered Association Rules. CIKM 1994: 401-407 BibTeX
[14]
Brian Lent, Arun N. Swami, Jennifer Widom: Clustering Association Rules. ICDE 1997: 220-231 BibTeX
[15]
Rosa Meo, Giuseppe Psaila, Stefano Ceri: A New SQL-like Operator for Mining Association Rules. VLDB 1996: 122-133 BibTeX
[16]
Renée J. Miller, Yuping Yang: Association Rules over Interval Data. SIGMOD Conference 1997: 452-461 BibTeX
[17]
...
[18]
Jong Soo Park, Ming-Syan Chen, Philip S. Yu: An Effective Hash Based Algorithm for Mining Association Rules. SIGMOD Conference 1995: 175-186 BibTeX
[19]
Ashok Savasere, Edward Omiecinski, Shamkant B. Navathe: An Efficient Algorithm for Mining Association Rules in Large Databases. VLDB 1995: 432-444 BibTeX
[20]
Abraham Silberschatz, Stanley B. Zdonik: Database Systems - Breaking Out of the Box. SIGMOD Record 26(3): 36-50(1997) BibTeX
[21]
Ramakrishnan Srikant, Rakesh Agrawal: Mining Generalized Association Rules. VLDB 1995: 407-419 BibTeX
[22]
Ramakrishnan Srikant, Rakesh Agrawal: Mining Quantitative Association Rules in Large Relational Tables. SIGMOD Conference 1996: 1-12 BibTeX
[23]
...
[24]
Hannu Toivonen: Sampling Large Databases for Association Rules. VLDB 1996: 134-145 BibTeX
[25]
...

Referenced by

  1. Anthony K. H. Tung, Raymond T. Ng, Laks V. S. Lakshmanan, Jiawei Han: Constraint-based clustering in large databases. ICDT 2001: 405-419
  2. Themistoklis Palpanas: Knowledge Discovery in Data Warehouses. SIGMOD Record 29(3): 88-100(2000)
  3. Theodore Johnson, Laks V. S. Lakshmanan, Raymond T. Ng: The 3W Model and Algebra for Unified Data Mining. VLDB 2000: 21-32
  4. Jiawei Han, Jian Pei, Yiwen Yin: Mining Frequent Patterns without Candidate Generation. SIGMOD Conference 2000: 1-12
  5. Shinichi Morishita, Jun Sese: Traversing Itemset Lattice with Statistical Metric Pruning. PODS 2000: 226-236
  6. Minos N. Garofalakis, Rajeev Rastogi, S. Seshadri, Kyuseok Shim: Data Mining and the Web: Past, Present and Future. Workshop on Web Information and Data Management 1999: 43-47
  7. Edwin M. Knorr, Raymond T. Ng: Finding Intensional Knowledge of Distance-Based Outliers. VLDB 1999: 211-222
  8. Minos N. Garofalakis, Rajeev Rastogi, Kyuseok Shim: SPIRIT: Sequential Pattern Mining with Regular Expression Constraints. VLDB 1999: 223-234
  9. Raymond T. Ng, Laks V. S. Lakshmanan, Jiawei Han, Teresa Mah: Exploratory Mining via Constrained Frequent Set Queries. SIGMOD Conference 1999: 556-558
  10. Laks V. S. Lakshmanan, Raymond T. Ng, Jiawei Han, Alex Pang: Optimization of Constrained Frequent Set Queries with 2-variable Constraints. SIGMOD Conference 1999: 157-168
  11. Kevin S. Beyer, Raghu Ramakrishnan: Bottom-Up Computation of Sparse and Iceberg CUBEs. SIGMOD Conference 1999: 359-370
  12. Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan: A Framework for Measuring Changes in Data Characteristics. PODS 1999: 126-137
  13. Jiawei Han, Guozhu Dong, Yiwen Yin: Efficient Mining of Partial Periodic Patterns in Time Series Database. ICDE 1999: 106-115
  14. Roberto J. Bayardo Jr., Rakesh Agrawal, Dimitrios Gunopulos: Constraint-Based Rule Mining in Large, Dense Databases. ICDE 1999: 188-197
  15. Jiawei Han: Towards On-Line Analytical Mining in Large Databases. SIGMOD Record 27(1): 97-107(1998)
  16. Tomasz Imielinski, Aashu Virmani: Association Rules... and What's Next? Towards Second Generation Data Mining Systems. ADBIS 1998: 6-25
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:40:41 2009