Efficient Construction of Regression Trees with Range and Region Splitting.
Yasuhiko Morimoto, Hiromu Ishii, Shinichi Morishita:
Efficient Construction of Regression Trees with Range and Region Splitting.
VLDB 1997: 166-175
@inproceedings{DBLP:conf/vldb/MorimotoIM97,
author = {Yasuhiko Morimoto and
Hiromu Ishii and
Shinichi Morishita},
editor = {Matthias Jarke and
Michael J. Carey and
Klaus R. Dittrich and
Frederick H. Lochovsky and
Pericles Loucopoulos and
Manfred A. Jeusfeld},
title = {Efficient Construction of Regression Trees with Range and Region
Splitting},
booktitle = {VLDB'97, Proceedings of 23rd International Conference on Very
Large Data Bases, August 25-29, 1997, Athens, Greece},
publisher = {Morgan Kaufmann},
year = {1997},
isbn = {1-55860-470-7},
pages = {166-175},
ee = {db/conf/vldb/MorimotoIM97.html},
crossref = {DBLP:conf/vldb/97},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
We propose an efficient way of constructing regression trees in order to
predict the objective numeric attribute values of given tuples. A
regression tree is a rooted binary tree such that each internal node
contains a test, which can be expressed as an RDB query, for splitting
tuples into two disjoint classes and passing data in each class down to
the left or right subtree. The mean of the objective attribute values at
the leaf is used as the predicted value of the tuple.
To test a numeric attribute, traditional approaches use a guillotine-cut
splitting that classifies data into those below a given value and
others. Instead, we consider a family ℛ of grid-regions in the
plane associated with two given numeric attributes. We propose to use a
test that splits data into those that lie inside a region R in ℛ and those
that lie outside.
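To make the traditional baseline concrete, the following minimal Python sketch picks a guillotine cut on a single numeric attribute by minimizing the summed squared error of the two resulting classes. It is an illustration only and is not taken from the paper; the names `sse` and `best_guillotine_cut` are hypothetical.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_guillotine_cut(
    xs: List[float], ys: List[float]
) -> Tuple[Optional[float], float]:
    """Return (threshold, error) for the best test `x < threshold`.

    `error` is the summed squared error of the two classes after the
    split. If no cut improves on the unsplit data, threshold is None.
    """
    pairs = sorted(zip(xs, ys))
    best_threshold, best_error = None, sse(ys)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between equal attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]    # tuples below the threshold
        right = [y for _, y in pairs[i:]]   # all other tuples
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


threshold, error = best_guillotine_cut(
    [1, 2, 3, 10, 11, 12], [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
)
```

Each leaf would then predict the mean objective value of the tuples it receives. The sketch recomputes the error from scratch at every candidate cut for clarity; running sums would avoid that cost.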
The contributions of this paper are as follows. We present an efficient
algorithm for computing the region R in this family that minimizes the
mean squared error after the introduction of the test with the region R.
Experiments confirmed that the use of region splitting gives a smaller
mean squared error of regression trees. Our approach can also generate
smaller regression trees.
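As a rough illustration of a region test, the Python sketch below searches exhaustively for the region whose inside/outside split reduces the squared error the most. Several things here are assumptions rather than the paper's method: the abstract does not say which grid-regions are allowed, so axis-parallel rectangles stand in for them; the search is brute force, not the efficient algorithm the authors present; and the names `prefix_sums`, `rect_total`, and `best_rectangle_split` are hypothetical. The input is a pre-bucketed grid holding, per cell, the tuple count and the sum of the objective attribute.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (r1, c1, r2, c2), half-open on r2 and c2


def prefix_sums(grid: List[List[float]]) -> List[List[float]]:
    """2-D prefix sums with an extra zero row and column at the front."""
    rows, cols = len(grid), len(grid[0])
    p = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            p[i + 1][j + 1] = grid[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
    return p


def rect_total(p: List[List[float]], r1: int, c1: int, r2: int, c2: int) -> float:
    """Sum over cells with r1 <= row < r2 and c1 <= col < c2."""
    return p[r2][c2] - p[r1][c2] - p[r2][c1] + p[r1][c1]


def best_rectangle_split(
    counts: List[List[float]], sums: List[List[float]]
) -> Tuple[Optional[Rect], float]:
    """Return (rectangle, gain), where gain is the drop in summed squared error.

    Splitting n tuples with total S into an inside class (n_in, s_in) and
    an outside class (n_out, s_out) lowers the squared error by
    s_in^2/n_in + s_out^2/n_out - S^2/n.
    """
    rows, cols = len(counts), len(counts[0])
    pc, ps = prefix_sums(counts), prefix_sums(sums)
    n_all, s_all = pc[rows][cols], ps[rows][cols]
    best_rect, best_gain = None, 0.0
    for r1 in range(rows):
        for r2 in range(r1 + 1, rows + 1):
            for c1 in range(cols):
                for c2 in range(c1 + 1, cols + 1):
                    n_in = rect_total(pc, r1, c1, r2, c2)
                    n_out = n_all - n_in
                    if n_in == 0 or n_out == 0:
                        continue  # one side empty: not a real split
                    s_in = rect_total(ps, r1, c1, r2, c2)
                    s_out = s_all - s_in
                    gain = s_in**2 / n_in + s_out**2 / n_out - s_all**2 / n_all
                    if gain > best_gain:
                        best_rect, best_gain = (r1, c1, r2, c2), gain
    return best_rect, best_gain


# One tuple per cell; only the centre cell has a non-zero objective value.
counts = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
sums = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
rect, gain = best_rectangle_split(counts, sums)
```

In this toy grid no single guillotine cut can isolate the centre cell, whereas a region test can, which is the kind of case where region splitting might yield a smaller tree. The exhaustive search examines every rectangle and so scales as the square of the number of rows times the square of the number of columns.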
Copyright © 1997 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Printed Edition
Matthias Jarke, Michael J. Carey, Klaus R. Dittrich, Frederick H. Lochovsky, Pericles Loucopoulos, Manfred A. Jeusfeld (Eds.):
VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases, August 25-29, 1997, Athens, Greece.
Morgan Kaufmann 1997, ISBN 1-55860-470-7
References
- [ACKT96]
- Tetsuo Asano, Danny Z. Chen, Naoki Katoh, Takeshi Tokuyama:
Polynomial-Time Solutions to Image Segmentation.
SODA 1996: 104-113
- [AIS93]
- Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami:
Mining Association Rules between Sets of Items in Large Databases.
SIGMOD Conference 1993: 207-216
- [AS94]
- Rakesh Agrawal, Ramakrishnan Srikant:
Fast Algorithms for Mining Association Rules in Large Databases.
VLDB 1994: 487-499
- [BFOS84]
- Leo Breiman, J. H. Friedman, R. A. Olshen, C. J. Stone:
Classification and Regression Trees.
Wadsworth 1984, ISBN 0-534-98053-8
- [FMMT96a]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama:
Mining Optimized Association Rules for Numeric Attributes.
PODS 1996: 182-191
- [FMMT96b]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama:
Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization.
SIGMOD Conference 1996: 13-23
- [FMMT96c]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama:
Constructing Efficient Decision Trees by Using Optimized Numeric Association Rules.
VLDB 1996: 146-155
- [HF95]
- Jiawei Han, Yongjian Fu:
Discovery of Multiple-Level Association Rules from Large Databases.
VLDB 1995: 420-431
- [MAR96]
- Manish Mehta, Rakesh Agrawal, Jorma Rissanen:
SLIQ: A Fast Scalable Classifier for Data Mining.
EDBT 1996: 18-32
- [PCY95]
- Jong Soo Park, Ming-Syan Chen, Philip S. Yu:
An Effective Hash Based Algorithm for Mining Association Rules.
SIGMOD Conference 1995: 175-186
- [PS91]
- Gregory Piatetsky-Shapiro, William J. Frawley (Eds.):
Knowledge Discovery in Databases.
AAAI/MIT Press 1991, ISBN 0-262-62080-4
- [PSF91]
- Gregory Piatetsky-Shapiro:
Discovery, Analysis, and Presentation of Strong Rules.
Knowledge Discovery in Databases 1991: 229-248
- [Qui86]
- J. Ross Quinlan:
Induction of Decision Trees.
Machine Learning 1(1): 81-106 (1986)
- [Qui93]
- J. Ross Quinlan:
C4.5: Programs for Machine Learning.
Morgan Kaufmann 1993, ISBN 1-55860-238-0
- [SA96]
- Ramakrishnan Srikant, Rakesh Agrawal:
Mining Quantitative Association Rules in Large Relational Tables.
SIGMOD Conference 1996: 1-12
- [YFM+97]
- Kunikazu Yoda, Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama:
Computing Optimized Rectilinear Regions for Association Rules.
KDD 1997: 96-103
Referenced by
- Shinichi Morishita, Jun Sese:
Traversing Itemset Lattice with Statistical Metric Pruning.
PODS 2000: 226-236
- Takeshi Fukuda, Hirofumi Matsuzawa:
Parallel Processing of Multiple Aggregate Queries on Shared-Nothing Multiprocessors.
EDBT 1998: 278-292
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:46:15 2009