Digital Symposium Collection 2000

Building Light-Weight Wrappers for Legacy Web Data-Sources Using W4F

Arnaud Sahuguet and Fabien Azavant


Abstract
The Web has become a major conduit to information repositories of all kinds. Today, more than 80% of the information published on the Web is generated by underlying databases, and this proportion keeps increasing. Access to these databases, however, is granted only through a Web gateway that uses forms as a query language and HTML as a display vehicle. Web data sources also include standalone HTML pages, hand-coded by individuals, that provide very useful information such as reviews, digests, and links. Even for information that also exists in underlying databases, the HTML interface is often the only one available to many would-be clients.


BIBTEX

@inproceedings{DBLP:conf/vldb/SahuguetA99,
  author    = {Arnaud Sahuguet and
               Fabien Azavant},
  editor    = {Malcolm P. Atkinson and
               Maria E. Orlowska and
               Patrick Valduriez and
               Stanley B. Zdonik and
               Michael L. Brodie},
  title     = {Building Light-Weight Wrappers for Legacy Web Data-Sources Using
               W4F},
  booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
               Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
               UK},
  publisher = {Morgan Kaufmann},
  year      = {1999},
  isbn      = {1-55860-615-5},
  pages     = {738-741},
  crossref  = {DBLP:conf/vldb/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Copyright (C) 2000 ACM