Data mining, Hypergraph Transversals, and Machine Learning.

Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Hannu Toivonen: Data mining, Hypergraph Transversals, and Machine Learning. PODS 1997: 209-216
  author    = {Dimitrios Gunopulos and
               Roni Khardon and
               Heikki Mannila and
               Hannu Toivonen},
  title     = {Data mining, Hypergraph Transversals, and Machine Learning},
  booktitle = {Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium
               on Principles of Database Systems, May 12-14, 1997, Tucson, Arizona},
  publisher = {ACM Press},
  year      = {1997},
  isbn      = {0-89791-910-6},
  pages     = {209-216},
  ee        = {, db/conf/pods/GunopulosKMT97.html},
  crossref  = {DBLP:conf/pods/97},
  bibsource = {DBLP,}


Several data mining problems can be formulated as problems of finding maximally specific sentences that are interesting in a database. We first show that this problem has a close relationship with the hypergraph transversal problem. We then analyze two algorithms that have been previously used in data mining, proving upper bounds on their complexity. The first algorithm is useful when the maximally specific interesting sentences are "small". We show that this algorithm can also be used to efficiently solve a special case of the hypergraph transversal problem, improving on previous results. The second algorithm utilizes a subroutine for hypergraph transversals, and is applicable in more general situations, with complexity close to a lower bound for the problem. We also relate these problems to the model of exact learning in computational learning theory, and use the correspondence to derive some corollaries.

Copyright © 1997 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.

Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ... BibTeX

Printed Edition

Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 12-14, 1997, Tucson, Arizona. ACM Press 1997, ISBN 0-89791-910-6
Contents BibTeX

Online Edition: ACM Digital Library

[Index Terms]
[Full Text in PDF Format, 1466 KB]


Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216 BibTeX
Nader H. Bshouty, Richard Cleve, Ricard Gavaldà, Sampath Kannan, Christino Tamon: Oracles and Queries That Are Sufficient for Exact Learning. J. Comput. Syst. Sci. 52(3): 421-433(1996) BibTeX
Luc De Raedt, Saso Dzeroski: First-Order jk-Clausal Theories are PAC-Learnable. Artif. Intell. 70(1-2): 375-392(1994) BibTeX
Thomas Eiter, Georg Gottlob: Identifying the Minimal Transversals of a Hypergraph and Related Problems. SIAM J. Comput. 24(6): 1278-1304(1995) BibTeX
Michael L. Fredman, Leonid Khachiyan: On the Complexity of Dualization of Monotone Disjunctive Normal Forms. J. Algorithms 21(3): 618-628(1996) BibTeX
Dimitrios Gunopulos, Heikki Mannila, Sanjeev Saluja: Discovering All Most Specific Sentences by Randomized Algorithms. ICDT 1997: 215-229 BibTeX
Roni Khardon: Translating between Horn Representations and their Characteristic Models. J. Artif. Intell. Res. (JAIR) 3: 349-372(1995) BibTeX
Willi Klösgen: Efficient Discovery of Interesting Statements in Databases. J. Intell. Inf. Syst. 4(1): 53-69(1995) BibTeX
Heikki Mannila, Kari-Jouko Räihä: Design by Example: An Application of Armstrong Relations. J. Comput. Syst. Sci. 33(2): 126-141(1986) BibTeX
Heikki Mannila, Kari-Jouko Räihä: Algorithms for Inferring Functional Dependencies from Relations. Data Knowl. Eng. 12(1): 83-99(1994) BibTeX
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo: Efficient Algorithms for Discovering Association Rules. KDD Workshop 1994: 181-192 BibTeX
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo: Discovering Frequent Episodes in Sequences. KDD 1995: 210-215 BibTeX

Referenced by

  1. Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications. SIGMOD Conference 1998: 94-105
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
ACM SIGMOD Anthology: Copyright © by ACM (, Corrections:
DBLP: Copyright © by Michael Ley (, last change: Sat May 16 23:34:17 2009