
Subtopic Structuring for Full-Length Document Access.

Marti A. Hearst, Christian Plaunt: Subtopic Structuring for Full-Length Document Access. SIGIR 1993: 59-68
@inproceedings{DBLP:conf/sigir/HearstP93,
  author    = {Marti A. Hearst and
               Christian Plaunt},
  editor    = {Robert Korfhage and
               Edie M. Rasmussen and
               Peter Willett},
  title     = {Subtopic Structuring for Full-Length Document Access},
  booktitle = {Proceedings of the 16th Annual International ACM-SIGIR Conference
               on Research and Development in Information Retrieval. Pittsburgh,
               PA, USA, June 27 - July 1, 1993},
  publisher = {ACM},
  year      = {1993},
  isbn      = {0-89791-605-0},
  pages     = {59-68},
  ee        = {db/conf/sigir/HearstP93.html},
  crossref  = {DBLP:conf/sigir/93},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

We argue that the advent of large volumes of full-length text, as opposed to short texts like abstracts and newswire, should be accompanied by corresponding new approaches to information access. Toward this end, we discuss the merits of imposing structure on full-length text documents; that is, a partition of the text into coherent multi-paragraph units that represent the pattern of subtopics that comprise the text. Using this structure, we can make a distinction between the main topics, which occur throughout the length of the text, and the subtopics, which are of only limited extent. We discuss why recognition of subtopic structure is important and how, to some degree of accuracy, it can be found. We describe a new way of specifying queries on full-length documents and then describe an experiment in which making use of the recognition of local structure achieves better results on a typical information retrieval task than does a standard IR measure.

Copyright © 1993 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 3, SIGIR, DASFAA'97, OODBS'86" and ...
DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

Robert Korfhage, Edie M. Rasmussen, Peter Willett (Eds.): Proceedings of the 16th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Pittsburgh, PA, USA, June 27 - July 1, 1993. ACM 1993, ISBN 0-89791-605-0

Online Edition: ACM Digital Library


Referenced by

  1. Marc Volz, Karl Aberer, Klemens Böhm: An OODBMS-IRS Coupling for Structured Documents. IEEE Data Eng. Bull. 19(1): 34-42 (1996)
  2. Yasushi Ogawa: Effective & Efficient Document Ranking without using a Large Lexicon. VLDB 1996: 192-202
  3. Marc Volz, Karl Aberer, Klemens Böhm: Applying a Flexible OODBMS-IRS-Coupling for Structured Document Handling. ICDE 1996: 10-19
  4. Brian Lowe, Justin Zobel, Ron Sacks-Davis: A Formal Model for Databases of Structured Text. DASFAA 1995: 449-456
  5. François Paradis: Using Linguistic and Discourse Structures to Derive Topics. CIKM 1995: 44-49
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:38:42 2009