Join Queries with External Text Sources: Execution and Optimization Techniques.

Surajit Chaudhuri, Umeshwar Dayal, Tak W. Yan: Join Queries with External Text Sources: Execution and Optimization Techniques. SIGMOD Conference 1995: 410-422
@inproceedings{DBLP:conf/sigmod/ChaudhuriDY95,
  author    = {Surajit Chaudhuri and
               Umeshwar Dayal and
               Tak W. Yan},
  editor    = {Michael J. Carey and
               Donovan A. Schneider},
  title     = {Join Queries with External Text Sources: Execution and Optimization
               Techniques},
  booktitle = {Proceedings of the 1995 ACM SIGMOD International Conference on
               Management of Data, San Jose, California, May 22-25, 1995},
  publisher = {ACM Press},
  year      = {1995},
  pages     = {410-422},
  ee        = {http://doi.acm.org/10.1145/223784.223856, db/conf/sigmod/sigmod95-33.html},
  crossref  = {DBLP:conf/sigmod/95},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Text is a pervasive information type, and many applications require querying over text sources in addition to structured data. This can be facilitated by a loose integration of an extensible database system and a text retrieval system. This paper studies the problem of query processing in such a system. The focus is on a class of conjunctive queries that include joins between the structured data and text data, in addition to selections over these two types of data. We investigate the relevance of previous work on distributed query processing and foreign function optimization. We show that, while several of these techniques can be adapted, the characteristics of text retrieval systems lend themselves to some new techniques. We describe a novel class of join method based on probing that is especially useful for joins with text systems, and we present a cost model for the various alternative query processing methods. We describe experimental results that confirm the utility of these methods. We show that the space of query plans is extended due to the additional techniques, and we describe an optimization algorithm for searching this extended space. Finally, we argue that the techniques we describe in this paper may be more generally applicable to other types of external data managers loosely integrated with a database system.

Copyright © 1995 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Printed Edition

Michael J. Carey, Donovan A. Schneider (Eds.): Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California, May 22-25, 1995. ACM Press 1995, SIGMOD Record 24(2), June 1995

References

[ACM93]
Serge Abiteboul, Sophie Cluet, Tova Milo: Querying and Updating the File. VLDB 1993: 73-84
[AS91]
Walid G. Aref, Hanan Samet: Optimization for Spatial Query Processing. VLDB 1991: 81-90
[BG92]
Ludger Becker, Ralf Hartmut Güting: Rule-Based Optimization and Query Processing in an Extensible Geometric Database System. ACM Trans. Database Syst. 17(2): 247-303 (1992)
[BGWR81]
Philip A. Bernstein, Nathan Goodman, Eugene Wong, Christopher L. Reeve, James B. Rothnie Jr.: Query Processing in a System for Distributed Databases (SDD-1). ACM Trans. Database Syst. 6(4): 602-625 (1981)
[BRG88]
Elisa Bertino, Fausto Rabitti, Simon J. Gibbs: Query Processing in a Multimedia Document System. ACM Trans. Inf. Syst. 6(1): 1-41 (1988)
[CACS94]
Vassilis Christophides, Serge Abiteboul, Sophie Cluet, Michel Scholl: From Structured Documents to Novel Query Facilities. SIGMOD Conference 1994: 313-324
[CDY]
...
[CGK89]
Danette Chimenti, Ruben Gamboa, Ravi Krishnamurthy: Towards an Open Architecture for LDL. VLDB 1989: 195-203
[CHK+91]
Tim Connors, Waqar Hasan, Curtis P. Kolovson, Marie-Anne Neimat, Donovan A. Schneider, W. Kevin Wilkinson: The Papyrus Integrated Data Server. PDIS 1991: 139
[CM94]
Mariano P. Consens, Tova Milo: Optimizing Queries on Files. SIGMOD Conference 1994: 301-312
[CMU94]
...
[Cor94]
...
[CS93]
Surajit Chaudhuri, Kyuseok Shim: Query Optimization in the Presence of Foreign Functions. VLDB 1993: 529-542
[DH91]
...
[Fal85]
Christos Faloutsos: Access Methods for Text. ACM Comput. Surv. 17(1): 49-74 (1985)
[Fal92]
...
[GD87]
Goetz Graefe, David J. DeWitt: The EXODUS Optimizer Generator. SIGMOD Conference 1987: 160-172
[GHK92]
Sumit Ganguly, Waqar Hasan, Ravi Krishnamurthy: Query Optimization for Parallel Execution. SIGMOD Conference 1992: 9-18
[Hew92]
...
[HFLP89]
Laura M. Haas, Johann Christoph Freytag, Guy M. Lohman, Hamid Pirahesh: Extensible Query Processing in Starburst. SIGMOD Conference 1989: 377-388
[HHK+93]
Waqar Hasan, Michael L. Heytens, Curtis P. Kolovson, Marie-Anne Neimat, Spyros Potamianos, Donovan A. Schneider: Papyrus GIS Demonstration. SIGMOD Conference 1993: 554-555
[HS93]
Joseph M. Hellerstein, Michael Stonebraker: Predicate Migration: Optimizing Queries with Expensive Predicates. SIGMOD Conference 1993: 267-276
[KMP93]
Alfons Kemper, Guido Moerkotte, Klaus Peithner: A Blackboard Architecture for Query Optimization in Object Bases. VLDB 1993: 543-554
[Lib94]
...
[LMH+85]
...
[LS88]
Clifford A. Lynch, Michael Stonebraker: Extended User-Defined Indexing with Application to Textual Databases. VLDB 1988: 306-317
[LW90]
Wan-Lik Lee, Darrell Woelk: Integration of Text Search with ORION. IEEE Data Eng. Bull. 13(1): 56-62 (1990)
[MDZ93]
Gail Mitchell, Umeshwar Dayal, Stanley B. Zdonik: Control of an Extensible Query Optimizer: A Planning-Based Approach. VLDB 1993: 517-528
[SAC+79]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34
[Sal89]
Gerard Salton: Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley 1989, ISBN 0-201-12227-8
[SJGP90]
Michael Stonebraker, Anant Jhingran, Jeffrey Goh, Spyros Potamianos: On Rules, Procedures, Caching and Views in Data Base Systems. SIGMOD Conference 1990: 281-290
[YA94]
Tak W. Yan, Jurgen Annevelink: Integrating a Structured-Text Retrieval System with an Object-Oriented Database System. VLDB 1994: 740-749
[YC85]
...

Referenced by

  1. Roy Goldman, Jennifer Widom: WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web. SIGMOD Conference 2000: 285-296
  2. Tobias Mayr, Praveen Seshadri: Client-Site Query Extensions. SIGMOD Conference 1999: 347-358
  3. Daniela Florescu, Alon Y. Levy, Ioana Manolescu, Dan Suciu: Query Optimization in the Presence of Limited Access Patterns. SIGMOD Conference 1999: 311-322
  4. Praveen Seshadri: Enhanced Abstract Data Types in Object-Relational Databases. VLDB J. 7(3): 130-140(1998)
  5. William W. Cohen: Integration of Heterogeneous Databases Without Common Domains Using Queries Based on Textual Similarity. SIGMOD Conference 1998: 201-212
  6. Praveen Seshadri, Miron Livny, Raghu Ramakrishnan: The Case for Enhanced Abstract Data Types. VLDB 1997: 66-75
  7. Ee-Peng Lim, Ying Lu: Distributed Query Processing for Clustered and Bibliographic Databases. DASFAA 1997: 441-450
  8. Yoshiharu Ishikawa, Takehiro Furudate, Shunsuke Uemura: A Wrapping Architecture for IR Systems to Mediate External Structured Document Sources. DASFAA 1997: 431-440