Dynamic Itemset Counting and Implication Rules for Market Basket Data.

Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur: Dynamic Itemset Counting and Implication Rules for Market Basket Data. SIGMOD Conference 1997: 255-264

@inproceedings{DBLP:conf/sigmod/BrinMUT97,
  author    = {Sergey Brin and
               Rajeev Motwani and
               Jeffrey D. Ullman and
               Shalom Tsur},
  editor    = {Joan Peckham},
  title     = {Dynamic Itemset Counting and Implication Rules for Market Basket
               Data},
  booktitle = {SIGMOD 1997, Proceedings ACM SIGMOD International Conference
               on Management of Data, May 13-15, 1997, Tucson, Arizona, USA},
  publisher = {ACM Press},
  year      = {1997},
  pages     = {255-264},
  ee        = {http://doi.acm.org/10.1145/253260.253325, db/conf/sigmod/BrinMUT97.html},
  crossref  = {DBLP:conf/sigmod/97},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

BibTeX

Abstract

We consider the problem of analyzing market-basket data and present several important contributions. First, we present a new algorithm for finding large itemsets which uses fewer passes over the data than classic algorithms, and yet uses fewer candidate itemsets than methods based on sampling. We investigate the idea of item reordering, which can improve the low-level efficiency of the algorithm. Second, we present a new way of generating "implication rules," which are normalized based on both the antecedent and the consequent and are truly implications (not simply a measure of co-occurrence), and we show how they produce more intuitive results than other methods. Finally, we show how different characteristics of real data, as opposed to synthetic data, can dramatically affect the performance of the system and the form of the results.

Copyright © 1997 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.

ACM SIGMOD Anthology

Online Version (ACM WWW Account required): Full Text in PDF Format

CDROM Version: Load the CDROM "Volume 1 Issue 1, SIGMOD '93-'97" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the CD and click on the path of your mount point:
/Anthology/smod9397 or /cdrom

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the DVD and click on the path of your mount point:
/Anthology/aDVD1 or /dvd

BibTeX

Printed Edition

Joan Peckham (Ed.): SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA. ACM Press 1997 BibTeX , SIGMOD Record 26(2), June 1997
Contents

Online Edition: ACM Digital Library

[Index Terms]
[Full Text in PDF Format, 1164 KB]

References

[AIS93a]: Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Database Mining: A Performance Perspective. IEEE Trans. Knowl. Data Eng. 5(6): 914-925(1993) BibTeX
[AIS93b]: Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216 BibTeX
[ALSS95]: Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney, Kyuseok Shim: Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. VLDB 1995: 490-501 BibTeX
[AS94]: Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499 BibTeX
[AS95]: Rakesh Agrawal, Ramakrishnan Srikant: Mining Sequential Patterns. ICDE 1995: 3-14 BibTeX
[MAR96]: Manish Mehta, Rakesh Agrawal, Jorma Rissanen: SLIQ: A Fast Scalable Classifier for Data Mining. EDBT 1996: 18-32 BibTeX
[SA95]: Ramakrishnan Srikant, Rakesh Agrawal: Mining Generalized Association Rules. VLDB 1995: 407-419 BibTeX
[Toi96]: Hannu Toivonen: Sampling Large Databases for Association Rules. VLDB 1996: 134-145 BibTeX

Referenced by

David Gibson, Jon M. Kleinberg, Prabhakar Raghavan: Clustering Categorical Data: An Approach Based on Dynamical Systems. VLDB J. 8(3-4): 222-236(2000)
Themistoklis Palpanas: Knowledge Discovery in Data Warehouses. SIGMOD Record 29(3): 88-100(2000)
Ke Wang, Yu He, Jiawei Han: Mining Frequent Itemsets Using Support Constraints. VLDB 2000: 43-52
Shinichi Morishita, Jun Sese: Traversing Itemset Lattice with Statistical Metric Pruning. PODS 2000: 226-236
H. V. Jagadish, J. Madar, Raymond T. Ng: Semantic Compression and Pattern Extraction with Fascicles. VLDB 1999: 186-198
Christian Hidber: Online Association Rule Mining. SIGMOD Conference 1999: 145-156
Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan: A Framework for Measuring Changes in Data Characteristics. PODS 1999: 126-137
Nicolas Pasquier, Yves Bastide, Rafik Taouil, Lotfi Lakhal: Discovering Frequent Closed Itemsets for Association Rules. ICDT 1999: 398-416
Roberto J. Bayardo Jr., Rakesh Agrawal, Dimitrios Gunopulos: Constraint-Based Rule Mining in Large, Dense Databases. ICDE 1999: 188-197
Suh-Ying Wur, Yungho Leu: An Effective Boolean Algorithm for Mining Association Rules in Large Databases. DASFAA 1999: 179-186
Charu C. Aggarwal, Philip S. Yu: Mining Large Itemsets for Association Rules. IEEE Data Eng. Bull. 21(1): 23-31(1998)
David Gibson, Jon M. Kleinberg, Prabhakar Raghavan: Clustering Categorical Data: An Approach Based on Dynamical Systems. VLDB 1998: 311-322
Min Fang, Narayanan Shivakumar, Hector Garcia-Molina, Rajeev Motwani, Jeffrey D. Ullman: Computing Iceberg Queries Efficiently. VLDB 1998: 299-310
Sunita Sarawagi, Shiby Thomas, Rakesh Agrawal: Integrating Mining with Relational Database Systems: Alternatives and Implications. SIGMOD Conference 1998: 343-354
Phillip B. Gibbons, Yossi Matias: New Sampling-Based Summary Statistics for Improving Approximate Query Answers. SIGMOD Conference 1998: 331-342
Roberto J. Bayardo Jr.: Efficiently Mining Long Patterns from Databases. SIGMOD Conference 1998: 85-93
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications. SIGMOD Conference 1998: 94-105
Charu C. Aggarwal, Philip S. Yu: A New Framework For Itemset Generation. PODS 1998: 18-24
Dao-I Lin, Zvi M. Kedem: Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set. EDBT 1998: 105-119
Tomasz Imielinski, Aashu Virmani: Association Rules... and What's Next? Towards Second Generation Data Mining Systems. ADBIS 1998: 6-25

BibTeX

ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]