Mining Deviants in a Time Series Database.
H. V. Jagadish, Nick Koudas, S. Muthukrishnan:
Mining Deviants in a Time Series Database.
VLDB 1999: 102-113
@inproceedings{DBLP:conf/vldb/KoudasMJ99,
author = {H. V. Jagadish and
Nick Koudas and
S. Muthukrishnan},
editor = {Malcolm P. Atkinson and
Maria E. Orlowska and
Patrick Valduriez and
Stanley B. Zdonik and
Michael L. Brodie},
title = {Mining Deviants in a Time Series Database},
booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
UK},
publisher = {Morgan Kaufmann},
year = {1999},
isbn = {1-55860-615-7},
pages = {102-113},
ee = {db/conf/vldb/KoudasMJ99.html},
crossref = {DBLP:conf/vldb/99},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Identifying outliers is an important data analysis
function. Statisticians have long studied techniques
to identify outliers in a data set in the context
of fitting the data to some model. In the case
of time series data, the situation is more murky.
For instance, the "typical" value could "drift"
up or down over time, so the extrema may not necessarily
be interesting. We wish to identify data points that are
somehow anomalous or "surprising".
We formally define the notion of a deviant in a time
series, based on a representation sparsity metric.
We develop an efficient algorithm to identify deviants
in a time series. We demonstrate how this technique can
be used to locate interesting artifacts in time series
data, and present experimental evidence of the value
of our technique.
As a side benefit, our algorithm is able to produce
histogram representations of data that
have substantially lower error than "optimal histograms"
for the same total storage, including both
histogram buckets and the deviants stored separately.
This is of independent interest for selectivity estimation.
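As a rough illustration of the idea in the abstract, the sketch below treats deviants as the points whose removal lets the remaining series be summarized by a fixed number of histogram buckets with the least error. This is only one possible reading of the abstract: the brute-force search, the squared-error measure, and the names `sse`, `histogram_error`, and `find_deviants` are assumptions made for illustration, and this is not the efficient algorithm the paper develops.

```python
from itertools import combinations


def sse(values):
    """Sum of squared errors when every value is replaced by the bucket mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def histogram_error(series, buckets):
    """Least total SSE over all splits of `series` into contiguous buckets.

    Plain dynamic program (assumed error measure: squared error per bucket).
    """
    n = len(series)
    if n == 0:
        return 0.0
    buckets = min(buckets, n)
    inf = float("inf")
    # best[b][i]: least error covering the first i points with b buckets
    best = [[inf] * (n + 1) for _ in range(buckets + 1)]
    best[0][0] = 0.0
    for b in range(1, buckets + 1):
        for i in range(1, n + 1):
            for j in range(b - 1, i):
                if best[b - 1][j] < inf:
                    cand = best[b - 1][j] + sse(series[j:i])
                    if cand < best[b][i]:
                        best[b][i] = cand
    return best[buckets][n]


def find_deviants(series, buckets, k):
    """Brute force: try every set of k positions, keep the set whose removal
    gives the lowest histogram error on the remaining points."""
    best_set, best_err = None, float("inf")
    for removed in combinations(range(len(series)), k):
        skip = set(removed)
        rest = [v for i, v in enumerate(series) if i not in skip]
        err = histogram_error(rest, buckets)
        if err < best_err:
            best_set, best_err = removed, err
    return best_set, best_err


# Hypothetical series: a level shift from 1 to 2, plus one spike at position 3.
series = [1, 1, 1, 9, 1, 1, 2, 2, 2, 2]
deviants, error = find_deviants(series, buckets=2, k=1)
```

In this toy series the level shift from 1 to 2 is absorbed by the two buckets, so the spike is the only point whose removal drives the error down. That mirrors the abstract's remark that drift alone does not make a value interesting. The exhaustive search is exponential in `k` and is meant only to make the definition concrete.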
Copyright © 1999 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Printed Edition
Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, Michael L. Brodie (Eds.):
VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK.
Morgan Kaufmann 1999, ISBN 1-55860-615-7
References
- [AAR95]
- Andreas Arning, Rakesh Agrawal, Prabhakar Raghavan:
A Linear Method for Deviation Detection in Large Databases.
KDD 1996: 164-169
- [Bel54]
- ...
- [Cha84]
- ...
- [GMP97]
- Phillip B. Gibbons, Yossi Matias, Viswanath Poosala:
Fast Incremental Maintenance of Approximate Histograms.
VLDB 1997: 466-475
- [HDY99]
- Jiawei Han, Guozhu Dong, Yiwen Yin:
Efficient Mining of Partial Periodic Patterns in Time Series Database.
ICDE 1999: 106-115
- [Ioa93]
- Yannis E. Ioannidis:
Universality of Serial Histograms.
VLDB 1993: 256-267
- [IP95]
- Yannis E. Ioannidis, Viswanath Poosala:
Balancing Histogram Optimality and Practicality for Query Result Size Estimation.
SIGMOD Conference 1995: 233-244
- [JKM+98]
- H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Viswanath Poosala, Kenneth C. Sevcik, Torsten Suel:
Optimal Histograms with Quality Guarantees.
VLDB 1998: 275-286
- [KN98]
- Edwin M. Knorr, Raymond T. Ng:
Algorithms for Mining Distance-Based Outliers in Large Datasets.
VLDB 1998: 392-403
- [PI97]
- Viswanath Poosala, Yannis E. Ioannidis:
Selectivity Estimation Without the Attribute Value Independence Assumption.
VLDB 1997: 486-495
- [PIHS96]
- Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita:
Improved Histograms for Selectivity Estimation of Range Predicates.
SIGMOD Conference 1996: 294-305
Referenced by
- Themistoklis Palpanas:
Knowledge Discovery in Data Warehouses.
SIGMOD Record 29(3): 88-100(2000)
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:46:25 2009