Cooperative Agents for Conceptual Search and Browsing of World Wide Web Resources

Susan Gauch
Electrical Engineering and Computer Science
University of Kansas

Contact Information

Dr. Susan Gauch
Electrical Engineering and Computer Science
415 Snow Hall
University of Kansas
Lawrence, KS 66045
Phone: (785) 864-8817
Fax: (785) 864-3226
Email: sgauch@eecs.ukans.edu

WWW Page

Co-operative Agents Project Home Page

Keywords

information retrieval, co-operative agents, classification, query routing, distributed browsing, visualization

Project Award Information

Cooperative Agents for Conceptual Search and Browsing of World Wide Web Resources. Year one of a four-year grant, 1997-2001.

Project Summary

Our primary goals are fourfold: Our approach employs co-operating, distributed, intelligent agents to organize information on the Web. Each participating Web site uses local agents to characterize the local information with respect to a conceptual ontology. Additional local agents support search, browsing and visualization of the local information. By sharing summarized, content-based meta-information with regional agents, the sites make regional browsing, search and visualization possible as well.
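The sharing of summarized meta-information with a regional agent can be illustrated with a minimal sketch. It assumes that each local summary is simply a mapping from ontology concept to a numeric weight and that the regional agent adds the weights together; the project description does not specify the summary format or the merging rule, so the function name, the site names and the additive rule are all hypothetical.

```python
from collections import Counter

def merge_summaries(local_summaries):
    """Combine per-site concept weights into one regional summary.

    Assumption: a summary is a dict of concept -> weight, and regional
    weights are plain sums of the local weights.
    """
    regional = Counter()
    for site_weights in local_summaries.values():
        regional.update(site_weights)  # adds weights concept by concept
    return dict(regional)

# Hypothetical summaries reported by two local agents.
local_summaries = {
    "site-a": {"Computing": 2, "Biology": 1},
    "site-b": {"Computing": 5},
}
regional = merge_summaries(local_summaries)
```

Other merging rules (for example, normalizing each site before summing) would be equally plausible under the same description.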

Goals, Objectives, and Targeted Activities

We have completed the initial versions of the local characterizing agent which, given an ontology, weights the nodes of the ontology with respect to information collected from the local site. We have also completed the initial versions of the local search agent and the local browsing agent (which allows users to browse the weighted ontology and see the Web pages automatically attached to each node). Our goals for next year are to develop initial prototypes of the regional agents and to evaluate the effectiveness of our distributed approach with respect to a single centralized collection of Web pages.
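As a rough illustration of what a characterizing agent does, the sketch below attaches each local page to one ontology node and weights every node by the number of pages at or below it. The classification method is not described in this summary, so the keyword-overlap classifier, the toy ontology, the keyword sets and the page texts are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy ontology: concept -> parent concept (None marks the root).
PARENT = {"Science": None, "Computing": "Science", "Biology": "Science"}

# Hypothetical keyword profiles for the leaf concepts.
KEYWORDS = {
    "Computing": {"agent", "search", "software"},
    "Biology": {"cell", "gene", "protein"},
}

def classify(text):
    """Return the concept whose keywords overlap the page most, or None."""
    words = set(text.lower().split())
    best, best_overlap = None, 0
    for concept, keywords in KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best, best_overlap = concept, overlap
    return best

def characterize(pages):
    """Weight each node by the number of local pages at or below it."""
    weights = defaultdict(int)
    attached = defaultdict(list)
    for url, text in pages.items():
        concept = classify(text)
        if concept is None:
            continue  # page matches no concept; leave it unattached
        attached[concept].append(url)
        node = concept
        while node is not None:  # propagate the count up to the root
            weights[node] += 1
            node = PARENT[node]
    return dict(weights), dict(attached)

# Hypothetical local pages.
pages = {
    "/a.html": "A search agent is software",
    "/b.html": "gene and protein data",
    "/c.html": "software agent toolkit",
}
weights, attached = characterize(pages)
```

The `attached` mapping corresponds to the pages a browsing agent could show under each node, and `weights` to the node weights a user would see while browsing.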

Indication of Success

We have been able to characterize several different Web sites with respect to our ontology entirely automatically. In addition, a content-based browsing structure for those sites has been built entirely automatically. Informal evaluations have found that the weights on the nodes provide useful clues for users and that the node weights reflect the site contents. Formal evaluation experiments are planned for the upcoming year. In particular, the ability of the weighted ontologies to form the basis of query routing will be a major focus for 1998-1999. A long-range impact may be the development of much higher precision, more focused searching for online Web resources. Queries can be sent to only the most promising sites for processing, rather than treating all documents from all sites equally, as the current search approach does.
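One way weighted ontologies could drive query routing is sketched below. It assumes the query has already been mapped to a single ontology concept and that a site's promise is the fraction of its total weight that falls on that concept; neither assumption comes from the project itself, and the site names and weights are invented for illustration.

```python
def route_query(query_concept, site_weights, top_k=2):
    """Return up to top_k sites ranked by their share of content on the concept."""
    scores = {}
    for site, weights in site_weights.items():
        total = sum(weights.values())
        if total > 0 and weights.get(query_concept, 0) > 0:
            scores[site] = weights[query_concept] / total
    # Highest share first; ties are broken alphabetically by site name.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_k]

# Hypothetical weighted-ontology summaries for three sites.
site_weights = {
    "site-a": {"Computing": 8, "Biology": 2},
    "site-b": {"Computing": 1, "Biology": 9},
    "site-c": {"Biology": 5},
}
targets = route_query("Computing", site_weights)
```

Sites with no weight on the concept are never contacted, which is the intended saving over sending every query to every site.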

Project Impact and Outcome

Two graduate students and one undergraduate student have been supported on this award. One graduate student has since graduated and is now working in industry and the other, a female student, is pursuing her Ph.D. Our exposure to the importance of modern graphical user interfaces has led to a change in our undergraduate curriculum to feature Java in our introductory courses. In addition, a new course on Information Retrieval (EECS 767) has been added to our regular offerings. There is potential for technology transfer to ProFusion, L.L.C., to enhance the query routing capabilities of the ProFusion meta-search engine.

Project References

Intelligent Information Agents for the World Wide Web, Edgar Casasola and Susan Gauch. Information and Telecommunication Technology Center, Technical Report ITTC-FY97-TR-11100-1.

Intelligent Information Agents: Review and Challenges for Distributed Information Sources, Donna Haverkamp and Susan Gauch. To appear in: Journal of the American Society for Information Science.

An Adaptive Multi-Agent Architecture for the ProFusion* Meta Search System, Yizhong Fan and Susan Gauch, Proc. of WebNet '97: The Second World Conference of the Web Society, Toronto, ON, November 1997.

ProFusion: Intelligent Fusion from Multiple, Distributed Search Engines, Susan Gauch, Guijun Wang and Mario Gomez, Journal of Universal Computer Science, Springer-Verlag, Vol. 2 (9), Sept. 1996.

Information Fusion With ProFusion, Susan Gauch and Guijun Wang, Proc. of WebNet '96: The First World Conference of the Web Society, San Francisco, CA, October 1996. pp. 174-179.

Area Background

Initially, information on the World Wide Web was found by random browsing, a time-consuming and ineffective method. Now, all-encompassing on-line search engines make finding information much easier. In fact, it has become too easy: it is not unusual to find 4,000 or more items that match a given query. What is needed is not just a search engine that produces better results but an organization of the search results based on the concepts contained in the various Web pages. The search process must also become more distributed, relieving the heavy demand placed on the handful of popular search sites. In addition, searching is not the only desirable way to access information; users should be able to browse through the Web in an organized manner. Several search engines provide subject hierarchies which can be browsed, but the associated Web pages are placed in the categories manually, which limits the amount of information available. In addition, each time the user visits a new node in the hierarchy, a page must be sent from the centralized server, making browsing a tedious activity. Finally, both searching and browsing allow users to view only a tiny portion of the World Wide Web. They are effective means of getting a detailed, partial picture but do not provide an abstraction or overview of the Web as a whole at a conceptual level: What kind of information is available today? How is the information available on the Web changing? These broader questions are essentially unanswerable today.

The main open questions in this area that relate to this research are the following:

How can ontologies be built and/or modified to reflect local site contents?
How can ontologies be merged to reflect regional contents?
What should the topology of the local and regional agents be?
How does distributing the search process affect the quality of the retrieval sets?
What is the quality of the automatically built browsing structure for the Web?
How can we visualize large portions of the Web?
How can we visualize the adaptation of the Web over time?

To address these issues, we need to learn more about agent communication, information summarization and information presentation.

In general, an active area of research is the use of search agents. Some search agents are specifically designed to locate information available from various home pages on the World Wide Web. These agents mask the complexity of the Information Superhighway and filter the exploding amount of information available. ProFusion [Gauch et al., 1996], for example, contains a mediator agent which interacts with information agents representing underlying Web search engines. The mediator selects the information agents/search engines that are best for each individual query and fuses the multiple search results.
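The fusion step performed by a mediator can be sketched generically as follows. This is not ProFusion's published algorithm; it is a simple score-normalization merge under stated assumptions: each engine returns (URL, score) pairs on its own scale, scores are rescaled by the engine's top score, multiplied by a per-engine confidence value, and duplicates keep their best adjusted score. The engine names, URLs, scores and confidence values are hypothetical.

```python
def fuse(results_by_engine, confidence):
    """Merge ranked lists from several engines into one list of (url, score)."""
    fused = {}
    for engine, results in results_by_engine.items():
        if not results:
            continue
        top = max(score for _, score in results)
        if top <= 0:
            continue  # cannot normalize an engine with no positive scores
        for url, score in results:
            # Rescale to [0, 1], then weight by how much the engine is trusted.
            adjusted = (score / top) * confidence.get(engine, 1.0)
            fused[url] = max(fused.get(url, 0.0), adjusted)
    # Best score first; ties are broken by URL for a stable order.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical results from two engines that score on different scales.
results = {
    "engine-x": [("http://a.example/", 50.0), ("http://b.example/", 25.0)],
    "engine-y": [("http://b.example/", 0.9), ("http://c.example/", 0.45)],
}
merged = fuse(results, {"engine-x": 1.0, "engine-y": 0.8})
```

Normalizing before merging matters because raw scores from different engines are not comparable; the confidence values stand in for whatever per-query engine selection the mediator performs.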

Many other projects create agents which act on behalf of individual users. Webwatcher [Armstrong et al., 1995], Letizia [Lieberman, 1995] and the system described in [Balabanovic and Shoham, 1995] incorporate learning from relevance feedback to assist the user in browsing the Web for interesting pages. Amalthaea [Moukas, 1996] searches the Web on behalf of a user. It spawns many agents per user (approximately 400) to query Web search engines and filter the results, employing genetic algorithms for query generation. The Internet Softbot [Etzioni and Weld, 1995] differs from these systems in that it accepts high-level user goals and decides on the sequence of actions (Internet commands) required to satisfy those goals. It deals with structured information services such as weather map servers and stock quote servers, and it can handle notification requests by monitoring any of a variety of events and autonomously reporting important occurrences to the user.

Area References

Armstrong, R., Freitag, D., Joachims, T., and Mitchell, T. "Webwatcher: A Learning Apprentice for the World Wide Web," Proc. of the Symposium on Information Gathering from Heterogeneous, Distributed Environments, AAAI Press, 1995.

Balabanovic, M. and Shoham, Y. "Learning Information Retrieval Agents: Experiments with Automated Web Browsing," AAAI Technical Report SS-95-08, Proc. of the 1995 AAAI Spring Symposium Series, AAAI Press, 1995.

Etzioni, O. and Weld, D. "Intelligent Agents on the Internet: Fact, Fiction, and Forecast," IEEE Expert, vol. 10, no. 4, 1995, pp. 44-49.

Gauch, S., Wang, G. and Gomez, M. "ProFusion: Intelligent Fusion from Multiple, Distributed Search Engines," Journal of Universal Computer Science, vol. 2, no. 9, September 1996, pp. 637-649.

Lieberman, H. "Letizia, an Agent that Assists Web Browsing," Proc. of IJCAI-95, AAAI Press, 1995.

Moukas, A. "Amalthaea: Information Discovery and Filtering using a Multiagent Evolving Ecosystem," Proc. of the Conf. on the Practical Application of Intelligent Agents and Multi-Agent Technology, London, 1996.

Potential Related Projects

Related work on content identification is being done under award #9712069, "Automatic Identification of Significant Topics in Domain Independent Full Text Analysis," and related work on retrieving data from distributed sources is being done under award #9712239, "Integration of Information from Internet Sources." This project is also relevant to the Knowledge and Distributed Intelligence program.