An Interactive Classification of Web Documents by Self-Organizing Maps and Search Engines.
Kenji Hatano, Ryouichi Sano, Yiwei Duan, Katsumi Tanaka:
An Interactive Classification of Web Documents by Self-Organizing Maps and Search Engines.
DASFAA 1999: 35-42

@inproceedings{DBLP:conf/dasfaa/HatanoSDT99,
author = {Kenji Hatano and
Ryouichi Sano and
Yiwei Duan and
Katsumi Tanaka},
editor = {Arbee L. P. Chen and
Frederick H. Lochovsky},
title = {An Interactive Classification of Web Documents by Self-Organizing
Maps and Search Engines},
booktitle = {Database Systems for Advanced Applications, Proceedings of the
Sixth International Conference on Database Systems for Advanced
Applications (DASFAA), April 19-21, Hsinchu, Taiwan},
publisher = {IEEE Computer Society},
year = {1999},
isbn = {0-7695-0084-6},
pages = {35-42},
ee = {db/conf/dasfaa/HatanoSDT99.html},
crossref = {DBLP:conf/dasfaa/99},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
In this paper, we propose an effective classification view mechanism for hypertext data such as web documents, based on
Kohonen's Self-Organizing Map (SOM) and search engines. Web documents collected by search engines are automatically
classified by SOM, and the obtained SOMs are incrementally modified according to the interaction between users and SOMs. At
present, various search engines are designed to retrieve web documents. When we use search engines to retrieve web documents,
we get as many answers as ever, so examining each web document takes a great deal of labor. Therefore, to complement
search engines, we need a function that classifies web documents according to the user's point of view and purpose.
Furthermore, conventional search engines cannot retrieve pertinent web documents when a specific topic is described by
more than one web document. To solve these problems, we developed a content-based clustering system for web documents. In
this system, web documents are automatically clustered by feature vectors produced from individual web documents or from minimal
subgraphs consisting of multiple web documents, and their overview maps are dynamically generated by SOM. Furthermore, we
propose a method by which an obtained SOM is modified through user interaction such as feedback operations. It is important that
the system reflect the aim of classification and the purpose of retrieval. In our research, we intend to solve these problems by
providing a view mechanism in which the Basic Units for retrieval and clustering of Web Documents (BUWDs) are changeable by
users and relevance feedback operations enable the generation of an overview map which reflects user needs.
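
The clustering step described in the abstract (feature vectors mapped onto a SOM) can be illustrated with a minimal sketch. This is not the authors' implementation: the term-frequency features, the grid size, the learning-rate and radius schedules, and the function names `tf_vector`, `best_matching_unit`, and `train_som` are all assumptions chosen for illustration. BUWD construction and relevance feedback are not modeled.

```python
import math
import random
from collections import Counter


def tf_vector(text, vocabulary):
    """L2-normalised term-frequency vector (an assumed feature choice)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[term]) for term in vocabulary]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def best_matching_unit(weights, vec):
    """Return (row, col) of the map node whose weights are closest to vec."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(w, vec))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(vectors, rows=3, cols=3, epochs=200, seed=0):
    """Train a small rectangular SOM with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                        # decaying learning rate
        radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5
        for vec in rng.sample(vectors, len(vectors)):  # shuffled pass
            br, bc = best_matching_unit(weights, vec)
            for r in range(rows):
                for c in range(cols):
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2.0 * radius * radius))
                    w = weights[r][c]
                    for i in range(dim):
                        # Pull the node towards the input, scaled by h.
                        w[i] += lr * h * (vec[i] - w[i])
    return weights


if __name__ == "__main__":
    vocab = ["som", "map", "search", "engine"]  # hypothetical vocabulary
    docs = ["som map som", "search engine search", "map som"]
    vecs = [tf_vector(d, vocab) for d in docs]
    som = train_som(vecs)
    for doc, vec in zip(docs, vecs):
        print(doc, "->", best_matching_unit(som, vec))
```

Documents that land on the same or neighbouring nodes would be shown together on the overview map. In this sketch, a user-driven modification of the map could be approximated by reweighting the feature vectors and retraining, though the paper's actual feedback method is not specified in the abstract.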
Copyright © 1999 by The Institute of
Electrical and Electronic Engineers, Inc. (IEEE).
Abstract used with permission.
References
- [1] Rodrigo A. Botafogo, Ehud Rivlin, Ben Shneiderman: Structural Analysis of Hypertexts: Identifying Hierarchies and Useful Metrics. ACM Trans. Inf. Syst. 10(2): 142-180 (1992)
- [2] ...
- [3] Kenji Hatano, Qing Qian, Katsumi Tanaka: A SOM-Based Information Organizer for Text and Video Data. DASFAA 1997: 205-214
- [4] ...
- [5] ...
- [6] ...
- [7] Sougata Mukherjea, James D. Foley, Scott E. Hudson: Interactive Clustering for Navigating in Hypermedia Systems. ECHT 1994: 136-145
- [8] Jitender S. Deogun, Vijay V. Raghavan: User-Oriented Document Clustering: A Framework for Learning in Information Retrieval. SIGIR 1986: 157-163
- [9] ...
- [10] Gerard Salton: Recent Studies in Automatic Text Analysis and Document Retrieval. J. ACM 20(2): 258-278 (1973)
- [11] Gerard Salton, James Allan, Chris Buckley: Automatic Structuring and Retrieval of Large Text Files. Commun. ACM 37(2): 97-108 (1994)
- [12] Keishi Tajima, Yoshiaki Mizuuchi, Masatsugu Kitagawa, Katsumi Tanaka: Cut as a Querying Unit for WWW, Netnews, e-mail. Hypertext 1998: 235-244
- [13] Ron Weiss, Bienvenido Vélez, Mark A. Sheldon, Chanathip Namprempre, Peter Szilagyi, Andrzej Duda, David K. Gifford: HyPursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext Clustering. Hypertext 1996: 180-193
- [14] Budi Yuwono, Dik Lun Lee: Search and Ranking Algorithms for Locating Resources on the World Wide Web. ICDE 1996: 164-171
DASFAA 1999 Proceedings: Copyright © by IEEE,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:05:36 2009