Fast Algorithms for Mining Association Rules in Large Databases.

Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
@inproceedings{DBLP:conf/vldb/AgrawalS94,
  author    = {Rakesh Agrawal and
               Ramakrishnan Srikant},
  editor    = {Jorge B. Bocca and
               Matthias Jarke and
               Carlo Zaniolo},
  title     = {Fast Algorithms for Mining Association Rules in Large Databases},
  booktitle = {VLDB'94, Proceedings of 20th International Conference on Very
               Large Data Bases, September 12-15, 1994, Santiago de Chile, Chile},
  publisher = {Morgan Kaufmann},
  year      = {1994},
  isbn      = {1-55860-153-8},
  pages     = {487-499},
  ee        = {db/conf/vldb/vldb94-487.html},
  crossref  = {DBLP:conf/vldb/94},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems. We also show how the best features of the two proposed algorithms can be combined into a hybrid algorithm, called AprioriHybrid. Scale-up experiments show that AprioriHybrid scales linearly with the number of transactions. AprioriHybrid also has excellent scale-up properties with respect to the transaction size and the number of items in the database.

Copyright © 1994 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Printed Edition

Jorge B. Bocca, Matthias Jarke, Carlo Zaniolo (Eds.): VLDB'94, Proceedings of 20th International Conference on Very Large Data Bases, September 12-15, 1994, Santiago de Chile, Chile. Morgan Kaufmann 1994, ISBN 1-55860-153-8

References

[1]
Rakesh Agrawal, Christos Faloutsos, Arun N. Swami: Efficient Similarity Search In Sequence Databases. FODO 1993: 69-84
[2]
Rakesh Agrawal, Sakti P. Ghosh, Tomasz Imielinski, Balakrishna R. Iyer, Arun N. Swami: An Interval Classifier for Database Mining Applications. VLDB 1992: 560-573
[3]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Database Mining: A Performance Perspective. IEEE Trans. Knowl. Data Eng. 5(6): 914-925(1993)
[4]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216
[5]
...
[6]
...
[7]
Ronald J. Brachman, Peter G. Selfridge, Loren G. Terveen, Boris Altman, Fern Halper, Thomas Kirk, Alan Lazar, Deborah L. McGuinness, Lori Alperin Resnick: Integrated Support for Data Archeology. Int. J. Cooperative Inf. Syst. 2(2): 159-185(1993)
[8]
Leo Breiman, J. H. Friedman, R. A. Olshen, C. J. Stone: Classification and Regression Trees. Wadsworth 1984, ISBN 0-534-98053-8
[9]
...
[10]
Douglas H. Fisher: Knowledge Acquisition via Incremental Conceptual Clustering. Machine Learning 2(2): 139-172(1987)
[11]
Jiawei Han, Yandong Cai, Nick Cercone: Knowledge Discovery in Databases: An Attribute-Oriented Approach. VLDB 1992: 547-559
[12]
...
[13]
Maurice A. W. Houtsma, Arun N. Swami: Set-Oriented Mining for Association Rules in Relational Databases. ICDE 1995: 25-33
[14]
Ravi Krishnamurthy, Tomasz Imielinski: Research Directions in Knowledge Discovery. SIGMOD Record 20(3): 76-78(1991)
[15]
...
[16]
Heikki Mannila, Kari-Jouko Räihä: Dependency Inference. VLDB 1987: 155-158
[17]
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo: Efficient Algorithms for Discovering Association Rules. KDD Workshop 1994: 181-192
[18]
...
[19]
...
[20]
Gregory Piatetsky-Shapiro: Discovery, Analysis, and Presentation of Strong Rules. Knowledge Discovery in Databases 1991: 229-248
[21]
Gregory Piatetsky-Shapiro, William J. Frawley (Eds.): Knowledge Discovery in Databases. AAAI/MIT Press 1991, ISBN 0-262-62080-4
[22]
J. Ross Quinlan: C4.5: Programs for Machine Learning. Morgan Kaufmann 1993, ISBN 1-55860-238-0

Referenced by

  1. Noel Novelli, Rosine Cicchetti: FUN: An Efficient Algorithm for Mining Functional and Embedded Dependencies. ICDT 2001: 189-203
  2. Toon Calders, Jan Paredaens: Axiomatization of Frequent Sets. ICDT 2001: 204-218
  3. Flip Korn, Alexandros Labrinidis, Yannis Kotidis, Christos Faloutsos: Quantifiable Data Mining Using Ratio Rules. VLDB J. 8(3-4): 254-266(2000)
  4. David Gibson, Jon M. Kleinberg, Prabhakar Raghavan: Clustering Categorical Data: An Approach Based on Dynamical Systems. VLDB J. 8(3-4): 222-236(2000)
  5. Themistoklis Palpanas: Knowledge Discovery in Data Warehouses. SIGMOD Record 29(3): 88-100(2000)
  6. Haixun Wang, Carlo Zaniolo: Using SQL to Build New Aggregates and Extenders for Object-Relational Systems. VLDB 2000: 166-175
  7. Ke Wang, Yu He, Jiawei Han: Mining Frequent Itemsets Using Support Constraints. VLDB 2000: 43-52
  8. Theodore Johnson, Laks V. S. Lakshmanan, Raymond T. Ng: The 3W Model and Algebra for Unified Data Mining. VLDB 2000: 21-32
  9. Pradeep Shenoy, Jayant R. Haritsa, S. Sudarshan, Gaurav Bhalotia, Mayank Bawa, Devavrat Shah: Turbo-charging Vertical Mining of Large Databases. SIGMOD Conference 2000: 22-33
  10. Jiawei Han, Jian Pei, Yiwen Yin: Mining Frequent Patterns without Candidate Generation. SIGMOD Conference 2000: 1-12
  11. Shinichi Morishita, Jun Sese: Traversing Itemset Lattice with Statistical Metric Pruning. PODS 2000: 226-236
  12. Stéphane Lopes, Jean-Marc Petit, Lotfi Lakhal: Efficient Discovery of Functional Dependencies and Armstrong Relations. EDBT 2000: 350-364
  13. Karuna P. Joshi, Anupam Joshi, Yelena Yesha, Raghu Krishnapuram: Warehousing and Mining Web Logs. Workshop on Web Information and Data Management 1999: 63-68
  14. Minos N. Garofalakis, Rajeev Rastogi, S. Seshadri, Kyuseok Shim: Data Mining and the Web: Past, Present and Future. Workshop on Web Information and Data Management 1999: 43-47
  15. Ke Wang, Senqiang Zhou, Shiang Chen Liew: Building Hierarchical Classifiers Using Class Proximity. VLDB 1999: 363-374
  16. Masahisa Tamura, Masaru Kitsuregawa: Dynamic Load Balancing for Parallel Association Rule Mining on Heterogenous PC Cluster Systems. VLDB 1999: 162-173
  17. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins: Extracting Large-Scale Knowledge Bases from the Web. VLDB 1999: 639-650
  18. H. V. Jagadish, J. Madar, Raymond T. Ng: Semantic Compression and Pattern Extraction with Fascicles. VLDB 1999: 186-198
  19. Minos N. Garofalakis, Rajeev Rastogi, Kyuseok Shim: SPIRIT: Sequential Pattern Mining with Regular Expression Constraints. VLDB 1999: 223-234
  20. Wen-Chi Hou: A Framework for Statistical Data Mining with Summary Tables. SSDBM 1999: 14-23
  21. Laks V. S. Lakshmanan, Raymond T. Ng, Jiawei Han, Alex Pang: Optimization of Constrained Frequent Set Queries with 2-variable Constraints. SIGMOD Conference 1999: 157-168
  22. Christian Hidber: Online Association Rule Mining. SIGMOD Conference 1999: 145-156
  23. Kevin S. Beyer, Raghu Ramakrishnan: Bottom-Up Computation of Sparse and Iceberg CUBEs. SIGMOD Conference 1999: 359-370
  24. Charu C. Aggarwal, Joel L. Wolf, Philip S. Yu: A New Method for Similarity Indexing of Market Basket Data. SIGMOD Conference 1999: 407-418
  25. Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan: A Framework for Measuring Changes in Data Characteristics. PODS 1999: 126-137
  26. Nicolas Pasquier, Yves Bastide, Rafik Taouil, Lotfi Lakhal: Discovering Frequent Closed Itemsets for Association Rules. ICDT 1999: 398-416
  27. Rajeev Rastogi, Kyuseok Shim: Mining Optimized Support Rules for Numeric Attributes. ICDE 1999: 206-215
  28. Jiawei Han, Guozhu Dong, Yiwen Yin: Efficient Mining of Partial Periodic Patterns in Time Series Database. ICDE 1999: 106-115
  29. Brian Dunkel, Nandit Soparkar: Data Organization and Access for Efficient Data Mining. ICDE 1999: 522-529
  30. Mathias Géry, M. Hatem Haddad: Knowledge Discovery for Automatic Query Expansion on the World-Wide Web. ER (Workshops) 1999: 334-347
  31. Philip S. Yu: Data Mining and Personalization Technologies. DASFAA 1999: 6-13
  32. Suh-Ying Wur, Yungho Leu: An Effective Boolean Algorithm for Mining Association Rules in Large Databases. DASFAA 1999: 179-186
  33. Gunter Saake, Andreas Heuer: Datenbanken: Implementierungstechniken. MITP-Verlag 1999, ISBN 3-8266-0513-6
  34. Ming-Syan Chen, Jong Soo Park, Philip S. Yu: Efficient Data Mining for Path Traversal Patterns. IEEE Trans. Knowl. Data Eng. 10(2): 209-221(1998)
  35. Colin L. Carter, Howard J. Hamilton: Efficient Attribute-Oriented Generalization for Knowledge Discovery from Large Databases. IEEE Trans. Knowl. Data Eng. 10(2): 193-208(1998)
  36. Chan Man Kuok, Ada Wai-Chee Fu, Man Hon Wong: Mining Fuzzy Association Rules in Databases. SIGMOD Record 27(1): 41-46(1998)
  37. Jiawei Han: Towards On-Line Analytical Mining in Large Databases. SIGMOD Record 27(1): 97-107(1998)
  38. G. D. Ramkumar, Arun N. Swami: Clustering Data Without Distance Functions. IEEE Data Eng. Bull. 21(1): 9-14(1998)
  39. Eui-Hong Han, George Karypis, Vipin Kumar, Bamshad Mobasher: Hypergraph Based Clustering in High-Dimensional Data Sets: A Summary of Results. IEEE Data Eng. Bull. 21(1): 15-22(1998)
  40. Charu C. Aggarwal, Philip S. Yu: Mining Large Itemsets for Association Rules. IEEE Data Eng. Bull. 21(1): 23-31(1998)
  41. Erik Riedel, Garth A. Gibson, Christos Faloutsos: Active Storage for Large-Scale Data Mining and Multimedia. VLDB 1998: 62-73
  42. Sridhar Ramaswamy, Sameer Mahajan, Abraham Silberschatz: On the Discovery of Interesting Patterns in Association Rules. VLDB 1998: 368-379
  43. Yasuhiko Morimoto, Takeshi Fukuda, Hirofumi Matsuzawa, Takeshi Tokuyama, Kunikazu Yoda: Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. VLDB 1998: 380-391
  44. Flip Korn, Alexandros Labrinidis, Yannis Kotidis, Christos Faloutsos: Ratio Rules: A New Paradigm for Fast, Quantifiable Data Mining. VLDB 1998: 582-593
  45. Min Fang, Narayanan Shivakumar, Hector Garcia-Molina, Rajeev Motwani, Jeffrey D. Ullman: Computing Iceberg Queries Efficiently. VLDB 1998: 299-310
  46. Martin Ester, Hans-Peter Kriegel, Jörg Sander, Michael Wimmer, Xiaowei Xu: Incremental Clustering for Mining in a Data Warehousing Environment. VLDB 1998: 323-333
  47. Peter A. Boncz, Tim Rühl, Fred Kwakkel: The Drill Down Benchmark. VLDB 1998: 628-632
  48. Shalom Tsur, Jeffrey D. Ullman, Serge Abiteboul, Chris Clifton, Rajeev Motwani, Svetlozar Nestorov, Arnon Rosenthal: Query Flocks: A Generalization of Association-Rule Mining. SIGMOD Conference 1998: 1-12
  49. Takahiko Shintani, Masaru Kitsuregawa: Parallel Mining Algorithms for Generalized Association Rules with Classification Hierarchy. SIGMOD Conference 1998: 25-36
  50. KianSing Ng, Huan Liu, HweeBong Kwah: A Data Mining Application: Customer Retention at the Port of Singapore Authority (PSA). SIGMOD Conference 1998: 522-525
  51. Raymond T. Ng, Laks V. S. Lakshmanan, Jiawei Han, Alex Pang: Exploratory Mining and Pruning Optimizations of Constrained Association Rules. SIGMOD Conference 1998: 13-24
  52. Phillip B. Gibbons, Yossi Matias: New Sampling-Based Summary Statistics for Improving Approximate Query Answers. SIGMOD Conference 1998: 331-342
  53. Charu C. Aggarwal, Philip S. Yu: A New Framework For Itemset Generation. PODS 1998: 18-24
  54. Ashok Savasere, Edward Omiecinski, Shamkant B. Navathe: Mining for Strong Negative Associations in a Large Database of Customer Transactions. ICDE 1998: 494-502
  55. Rajeev Rastogi, Kyuseok Shim: Mining Optimized Association Rules with Categorical and Numeric Attributes. ICDE 1998: 503-512
  56. Banu Özden, Sridhar Ramaswamy, Abraham Silberschatz: Cyclic Association Rules. ICDE 1998: 412-421
  57. Rosa Meo, Giuseppe Psaila, Stefano Ceri: A Tightly-Coupled Architecture for Data Mining. ICDE 1998: 316-323
  58. Jun-Lin Lin, Margaret H. Dunham: Mining Association Rules: Anti-Skew Algorithms. ICDE 1998: 486-493
  59. Charu C. Aggarwal, Philip S. Yu: Online Generation of Association Rules. ICDE 1998: 402-411
  60. Chien-Le Goh, Masahiko Tsukamoto, Shojiro Nishio: Fast Methods with Magic Sampling for Knowledge Discovery in Deductive Databases with Large Deduction Results. ER Workshops 1998: 14-28
  61. Dao-I Lin, Zvi M. Kedem: Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set. EDBT 1998: 105-119
  62. Ling Feng, Hongjun Lu, Y. C. Tay, Anthony K. H. Tung: Buffer Management in Distributed Database Systems: A Data Mining Based Approach. EDBT 1998: 246-260
  63. Martin Ester, Rüdiger Wittmann: Incremental Generalization for Mining in a Data Warehousing Environment. EDBT 1998: 135-149
  64. Marek Wojciechowski, Maciej Zakrzewicz: Itemset Materializing for Fast Mining of Association Rules. ADBIS 1998: 284-295
  65. Tomasz Imielinski, Aashu Virmani: Association Rules... and What's Next? Towards Second Generation Data Mining Systems. ADBIS 1998: 6-25
  66. Jong Soo Park, Ming-Syan Chen, Philip S. Yu: Using a Hash-Based Method with Transaction Trimming for Mining Association Rules. IEEE Trans. Knowl. Data Eng. 9(5): 813-825(1997)
  67. Joseph M. Hellerstein: Online Processing Redux. IEEE Data Eng. Bull. 20(3): 20-29(1997)
  68. Yasuhiko Morimoto, Hiromu Ishii, Shinichi Morishita: Efficient Construction of Regression Trees with Range and Region Splitting. VLDB 1997: 166-175
  69. Renée J. Miller, Yuping Yang: Association Rules over Interval Data. SIGMOD Conference 1997: 452-461
  70. Jiawei Han, Krzysztof Koperski, Nebojsa Stefanovic: GeoMiner: A System Prototype for Spatial Data Mining. SIGMOD Conference 1997: 553-556
  71. Eui-Hong Han, George Karypis, Vipin Kumar: Scalable Parallel Data Mining for Association Rules. SIGMOD Conference 1997: 277-288
  72. Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur: Dynamic Itemset Counting and Implication Rules for Market Basket Data. SIGMOD Conference 1997: 255-264
  73. Sergey Brin, Rajeev Motwani, Craig Silverstein: Beyond Market Baskets: Generalizing Association Rules to Correlations. SIGMOD Conference 1997: 265-276
  74. David Wai-Lok Cheung, Sau Dan Lee, Ben Kao: A General Incremental Technique for Maintaining Discovered Association Rules. DASFAA 1997: 185-194
  75. Tadeusz Morzy, Maciej Zakrzewicz: SQL-Like Language for Database Mining. ADBIS 1997: 311-317
  76. Daniel A. Keim, Hans-Peter Kriegel: Visualization Techniques for Mining Large Databases: A Comparison. IEEE Trans. Knowl. Data Eng. 8(6): 923-938(1996)
  77. Chien-Le Goh, Masahiko Tsukamoto, Shojiro Nishio: Knowledge Discovery in Deductive Databases with Large Deduction Results: the First Step. IEEE Trans. Knowl. Data Eng. 8(6): 952-956(1996)
  78. David Wai-Lok Cheung, Vincent T. Y. Ng, Ada Wai-Chee Fu, Yongjian Fu: Efficient Mining of Association Rules in Distributed Databases. IEEE Trans. Knowl. Data Eng. 8(6): 911-922(1996)
  79. Ming-Syan Chen, Jiawei Han, Philip S. Yu: Data Mining: An Overview from a Database Perspective. IEEE Trans. Knowl. Data Eng. 8(6): 866-883(1996)
  80. Rakesh Agrawal, John C. Shafer: Parallel Mining of Association Rules. IEEE Trans. Knowl. Data Eng. 8(6): 962-969(1996)
  81. Marisa S. Viveros, John P. Nearhos, Michael J. Rothman: Applying Data Mining Techniques to a Health Insurance Information System. VLDB 1996: 286-294
  82. Hannu Toivonen: Sampling Large Databases for Association Rules. VLDB 1996: 134-145
  83. Rosa Meo, Giuseppe Psaila, Stefano Ceri: A New SQL-like Operator for Mining Association Rules. VLDB 1996: 122-133
  84. Ramakrishnan Srikant, Rakesh Agrawal: Mining Quantitative Association Rules in Large Relational Tables. SIGMOD Conference 1996: 1-12
  85. Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization. SIGMOD Conference 1996: 13-23
  86. Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Mining Optimized Association Rules for Numeric Attributes. PODS 1996: 182-191
  87. Kimmo Hätönen, Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu Toivonen: Knowledge Discovery from Telecommunication Network Alarm Databases. ICDE 1996: 115-122
  88. David Wai-Lok Cheung, Jiawei Han, Vincent T. Y. Ng, C. Y. Wong: Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique. ICDE 1996: 106-114
  89. Ramakrishnan Srikant, Rakesh Agrawal: Mining Sequential Patterns: Generalizations and Performance Improvements. EDBT 1996: 3-17
  90. Nick Koudas, Christos Faloutsos, Ibrahim Kamel: Declustering Spatial Databases on a Multi-Computer Architecture. EDBT 1996: 592-614
  91. Christos Faloutsos: Fast Searching by Content in Multimedia Databases. IEEE Data Eng. Bull. 18(4): 31-40(1995)
  92. Ramakrishnan Srikant, Rakesh Agrawal: Mining Generalized Association Rules. VLDB 1995: 407-419
  93. Ashok Savasere, Edward Omiecinski, Shamkant B. Navathe: An Efficient Algorithm for Mining Association Rules in Large Databases. VLDB 1995: 432-444
  94. Jiawei Han, Yongjian Fu: Discovery of Multiple-Level Association Rules from Large Databases. VLDB 1995: 420-431
  95. Alberto Belussi, Christos Faloutsos: Estimating the Selectivity of Spatial Queries Using the `Correlation' Fractal Dimension. VLDB 1995: 299-310
  96. Jong Soo Park, Ming-Syan Chen, Philip S. Yu: An Effective Hash Based Algorithm for Mining Association Rules. SIGMOD Conference 1995: 175-186
  97. Christos Faloutsos, King-Ip Lin: FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets. SIGMOD Conference 1995: 163-174
  98. Rakesh Agrawal, Ramakrishnan Srikant: Mining Sequential Patterns. ICDE 1995: 3-14
  99. Jong Soo Park, Ming-Syan Chen, Philip S. Yu: Efficient Parallel Data Mining for Association Rules. CIKM 1995: 31-36
  100. Jiawei Han: Mining Knowledge at Multiple Concept Levels. CIKM 1995: 19-24