
A Sequential Algorithm for Training Text Classifiers.

David D. Lewis, William A. Gale: A Sequential Algorithm for Training Text Classifiers. SIGIR 1994: 3-12
@inproceedings{DBLP:conf/sigir/LewisG94,
  author    = {David D. Lewis and
               William A. Gale},
  editor    = {W. Bruce Croft and
               C. J. van Rijsbergen},
  title     = {A Sequential Algorithm for Training Text Classifiers},
  booktitle = {Proceedings of the 17th Annual International ACM-SIGIR Conference
               on Research and Development in Information Retrieval. Dublin,
               Ireland, 3-6 July 1994 (Special Issue of the SIGIR Forum)},
  publisher = {ACM/Springer},
  year      = {1994},
  isbn      = {3-540-19889-X},
  pages     = {3-12},
  ee        = {db/conf/sigir/LewisG94.html},
  crossref  = {DBLP:conf/sigir/94},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was developed and tested on a newswire text categorization task. This method, which we call uncertainty sampling, reduced by as much as 500-fold the amount of training data that would have to be manually classified to achieve a given level of effectiveness.
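The abstract names uncertainty sampling without describing the procedure. The following is a minimal sketch of the general idea usually associated with that name: repeatedly train a classifier, then ask a human to label the pool items whose predicted class probability is closest to 0.5. It is not the authors' implementation. The function names, the one-dimensional toy classifier (fit_threshold), and the simulated annotator (oracle) are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Model = Callable[[float], float]  # maps an item to P(class = 1)


def most_uncertain(probs: Sequence[float], k: int = 1) -> List[int]:
    """Return the positions of the k probabilities closest to 0.5."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


def fit_threshold(examples: Sequence[Tuple[float, int]]) -> Model:
    """Hypothetical toy classifier: logistic curve centred midway
    between the mean of the negative and positive examples.
    Assumes both classes are present in `examples`."""
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: 1.0 / (1.0 + math.exp(-(x - t)))


def uncertainty_sampling(
    pool: Sequence[float],
    fit: Callable[[Sequence[Tuple[float, int]]], Model],
    oracle: Callable[[float], int],
    seed: Dict[int, int],
    rounds: int,
    batch_size: int = 1,
) -> Dict[int, int]:
    """Grow a labeled set by querying the least certain pool items."""
    labeled = dict(seed)  # pool index -> label
    for _ in range(rounds):
        model = fit([(pool[i], y) for i, y in labeled.items()])
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        probs = [model(pool[i]) for i in unlabeled]
        for j in most_uncertain(probs, batch_size):
            i = unlabeled[j]
            labeled[i] = oracle(pool[i])  # stands in for manual labeling
    return labeled


if __name__ == "__main__":
    pool = [float(x) for x in range(11)]        # items 0.0 .. 10.0
    oracle = lambda x: 1 if x >= 6.5 else 0     # simulated annotator
    labeled = uncertainty_sampling(pool, fit_threshold, oracle,
                                   seed={0: 0, 10: 1}, rounds=3)
    print(sorted(labeled.items()))
```

In this toy run the queries concentrate near the simulated class boundary rather than being spread across the pool, which is the intuition behind labeling fewer examples for a given level of effectiveness.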

Copyright © 1994 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 3, SIGIR, DASFAA'97, OODBS'86" and ...
DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

W. Bruce Croft, C. J. van Rijsbergen (Eds.): Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Dublin, Ireland, 3-6 July 1994 (Special Issue of the SIGIR Forum). ACM/Springer 1994, ISBN 3-540-19889-X

Online Edition: ACM Digital Library


Referenced by

  1. Oren Etzioni: The World-Wide Web: Quagmire or Gold Mine? Commun. ACM 39(11): 65-68 (1996)
  2. Markus Tresch, Neal Palmer, Allen Luniewski: Type Classification of Semi-Structured Documents. VLDB 1995: 263-274
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:38:45 2009