|




















|
|
 |
|
 |
Comparing Hierarchical Data in External Memory
|
Sudarshan S. Chawathe
View Paper (PDF)
Return to Changes and Temporal Data
We present an external-memory algorithm for computing a minimum-cost edit script between two rooted, ordered, labeled trees. The I/O, RAM, and CPU costs of our algorithm are, respectively,
4mn+7m+5n
,
6S
, and
O(MN+(M+N)S
1.5
)
, where
M
and
N
are the input tree sizes,
S
is the block size,
m=M/S
, and
n=N/S
. This algorithm can make effective use of surplus RAM capacity to quadratically reduce I/O cost. We extend to trees the commonly used mapping from sequence comparison problems to shortest-path problems in edit graphs.
Note: References link to DBLP on the Web.
-
[AHU76]
-
Alfred V. Aho
,
Daniel S. Hirschberg
,
Jeffrey D. Ullman
: Bounds on the Complexity of the Longest Common Subsequence Problem.
JACM 23(1)
: 1-12(1976)
-
[CAW98]
-
Sudarshan S. Chawathe
,
Serge Abiteboul
,
Jennifer Widom
: Representing and Querying Changes in Semistructured Data.
ICDE 1998
: 4-13
-
[CGM97]
-
Sudarshan S. Chawathe
,
Hector Garcia-Molina
: Meaningful Change Detection in Structured Data.
SIGMOD Conference 1997
: 26-37
-
[CRGMW96]
-
Sudarshan S. Chawathe
,
Anand Rajaraman
,
Hector Garcia-Molina
,
Jennifer Widom
: Change Detection in Hierarchically Structured Information.
SIGMOD Conf. 1996
: 493-504
-
[LGM96]
-
Wilburt Labio
,
Hector Garcia-Molina
: Efficient Snapshot Differential Algorithms for Data Warehousing.
VLDB 1996
: 63-74
-
[MM85]
-
Webb Miller
,
Eugene W. Myers
: A File Comparison Program.
SP&E 15(11)
: 1025-1040(1985)
-
[MP80]
-
William J. Masek
,
Michael S. Paterson
: A Faster Algorithm Computing String Edit Distances.
JCSS 20(1)
: 18-31(1980)
-
[Mye86]
-
Eugene W. Myers
: An O(ND) Difference Algorithm and Its Variations.
Algorithmica 1(2)
: 251-266(1986)
-
[Sel77]
-
Stanley M. Selkow
: The Tree-to-Tree Editing Problem.
IPL 6(6)
: 184-186(1977)
-
[SK83]
-
...
-
[Tic85]
-
Walter F. Tichy
: RCS - A System for Version Control.
SP&E 15(7)
: 637-654(1985)
-
[Vit98]
-
Jeffrey Scott Vitter
: External Memory Algorithms.
PODS 1998
: 119-128
-
[WC76]
-
C. K. Wong
,
Ashok K. Chandra
: Bounds for the String Editing Problem.
JACM 23(1)
: 13-16(1976)
-
[WCM+94]
-
Jason Tsong-Li Wang
,
Gung-Wei Chirn
,
Thomas G. Marr
,
Bruce A. Shapiro
,
Dennis Shasha
,
Kaizhong Zhang
: Combinatorial Pattern Discovery for Scientific Data: Some Preliminary Results.
SIGMOD Conference 1994
: 115-125
-
[WF74]
-
Robert A. Wagner
,
Michael J. Fischer
: The String-to-String Correction Problem.
JACM 21(1)
: 168-173(1974)
-
[WMG90]
-
Sun Wu
,
Udi Manber
,
Gene Myers
,
Webb Miller
: An O(NP) Sequence Comparison Algorithm.
IPL 35(6)
: 317-323(1990)
-
[WSC+97]
-
Jason Tsong-Li Wang
,
Dennis Shasha
,
George Jyh-Shian Chang
,
Liam Relihan
,
Kaizhong Zhang
,
Girish Patel
: Structural Matching and Discovery in Document Databases.
SIGMOD Conference 1997
: 560-563
-
[WZJS94]
-
Jason Tsong-Li Wang
,
Kaizhong Zhang
,
Karpjoo Jeong
,
Dennis Shasha
: A System for Approximate Tree Matching.
TKDE 6(4)
: 559-571(1994)
-
[Yan91]
-
Wuu Yang
: Identifying Syntactic differences Between Two Programs.
SP&E 21(7)
: 739-755(1991)
-
[ZS89]
-
Kaizhong Zhang
,
Dennis Shasha
: Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems.
SIAM J. Comput. 18(6)
: 1245-1262(1989)
-
[ZWS95]
-
...
@inproceedings{DBLP:conf/vldb/Chawathe99,
author = {Sudarshan S. Chawathe},
editor = {Malcolm P. Atkinson and
Maria E. Orlowska and
Patrick Valduriez and
Stanley B. Zdonik and
Michael L. Brodie},
title = {Comparing Hierarchical Data in External Memory},
booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
UK},
publisher = {Morgan Kaufmann},
year = {1999},
isbn = {1-55860-615-5},
pages = {90-101},
crossref = {DBLP:conf/vldb/99},
bibsource = {DBLP, http://dblp.uni-trier.de} } },
Copyright(C) 2000 ACM
|
|
|
|
|
|
|