A New Method for Similarity Indexing of Market Basket Data
Charu C. Aggarwal,
Joel L. Wolf, and
Philip S. Yu
In recent years, many data mining methods have been proposed for finding useful and structured information in market basket data. The association rule model was recently proposed to discover useful patterns and dependencies in such data. This paper discusses a method for indexing market basket data efficiently for similarity search. The technique is likely to be very useful in applications that use the similarity in customer buying behavior to make peer recommendations. We propose an index called the signature table, which is very flexible in supporting a wide range of similarity functions. The construction of the index structure is independent of the similarity function, which can be specified at query time. The resulting similarity search algorithm shows excellent scalability with increasing memory availability and database size.
[1] Charu C. Aggarwal, Joel L. Wolf, Philip S. Yu, Marina Epelman: The S-Tree: An Efficient Index for Multidimensional Objects. SSD 1997: 350-373
[2] Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216
[3] Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
[4] ...
[5] Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. SIGMOD Conference 1990: 322-331
[6] Stefan Berchtold, Christian Böhm, Hans-Peter Kriegel: The Pyramid-Tree: Breaking the Curse of Dimensionality. SIGMOD Conference 1998: 142-153
[7] Stefan Berchtold, Bernhard Ertl, Daniel A. Keim, Hans-Peter Kriegel, Thomas Seidl: Fast Nearest Neighbor Search in High-Dimensional Space. ICDE 1998: 209-218
[8] ...
[9] Christos Faloutsos, Raphael Chan: Fast Text Access Methods for Optical and Large Magnetic Disks: Designs and Performance Comparison. VLDB 1988: 280-293
[10] Christos Faloutsos, Stavros Christodoulakis: Description and Performance Analysis of Signature File Methods for Office Filing. TOIS 5(3): 237-257 (1987)
[11] William B. Frakes, Ricardo A. Baeza-Yates (Eds.): Information Retrieval: Data Structures & Algorithms. Prentice-Hall 1992, ISBN 0-13-463837-9
[12] Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD Conference 1984: 47-57
[13] Klaus Hinrichs, Jürg Nievergelt: The Grid File: A Data Structure to Support Proximity Queries on Spatial Objects. WG 1983: 100-113
[14] David A. White, Ramesh Jain: Similarity Indexing: Algorithms and Performance. Storage and Retrieval for Image and Video Databases (SPIE) 1996: 62-73
[15] Norio Katayama, Shin'ichi Satoh: The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries. SIGMOD Conference 1997: 369-380
[16] King-Ip Lin, H. V. Jagadish, Christos Faloutsos: The TV-Tree: An Index Structure for High-Dimensional Data. VLDB Journal 3(4): 517-542 (1994)
[17] Nick Roussopoulos, Stephen Kelley, Frédéric Vincent: Nearest Neighbor Queries. SIGMOD Conference 1995: 71-79
[18] Gerard Salton: Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley 1989, ISBN 0-201-12227-8
[19] R. Sibson: SLINK: An Optimally Efficient Algorithm for the Single-Link Cluster Method. The Computer Journal 16(1): 30-34 (1973)
[20] Thomas Seidl, Hans-Peter Kriegel: Optimal Multi-Step k-Nearest Neighbor Search. SIGMOD Conference 1998: 154-165
[21] Thomas Seidl, Hans-Peter Kriegel: Efficient User-Adaptable Similarity Search in Large Multimedia Databases. VLDB 1997: 506-515
[22] Timos K. Sellis, Nick Roussopoulos, Christos Faloutsos: The R+-Tree: A Dynamic Index for Multi-Dimensional Objects. VLDB 1987: 507-518
[23] David A. White, Ramesh Jain: Similarity Indexing with the SS-tree. ICDE 1996: 516-523
@inproceedings{DBLP:conf/sigmod/AggarwalWY99,
  author    = {Charu C. Aggarwal and
               Joel L. Wolf and
               Philip S. Yu},
  editor    = {Alex Delis and
               Christos Faloutsos and
               Shahram Ghandeharizadeh},
  title     = {A New Method for Similarity Indexing of Market Basket Data},
  booktitle = {SIGMOD 1999, Proceedings ACM SIGMOD International Conference
               on Management of Data, June 1-3, 1999, Philadelphia, Pennsylvania,
               USA},
  publisher = {ACM Press},
  year      = {1999},
  isbn      = {1-58113-084-8},
  pages     = {407-418},
  crossref  = {DBLP:conf/sigmod/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
Copyright(C) 2000 ACM