@article{DBLP:journals/debu/BrinMPW98,
  author    = {Sergey Brin and Rajeev Motwani and Lawrence Page and Terry Winograd},
  title     = {What can you do with a Web in your Pocket?},
  journal   = {IEEE Data Eng. Bull.},
  volume    = {21},
  number    = {2},
  year      = {1998},
  pages     = {37--47},
  ee        = {db/journals/debu/BrinMPW98.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
The amount of information available online has grown enormously over the past decade. Fortunately, computing power, disk capacity, and network bandwidth have also increased dramatically. It is currently possible for a university research project to store and process the entire World Wide Web. Since there is a limit on how much text humans can generate, it is plausible that within a few decades one will be able to store and process all the human-generated text on the Web in a shirt pocket.
The Web is a very rich and interesting data source. In this paper, we describe the Stanford WebBase, a local repository of a significant portion of the Web. Furthermore, we describe a number of recent experiments that leverage the size and the diversity of the WebBase. First, we have largely automated the process of extracting a sizable relation of books (title, author pairs) from hundreds of data sources spread across the World Wide Web using a technique we call Dual Iterative Pattern Relation Extraction. Second, we have developed a global ranking of Web pages called PageRank based on the link structure of the Web that has properties that are useful for search and navigation. Third, we have used PageRank to develop a novel search engine called Google, which also makes heavy use of anchor text. All of these experiments rely significantly on the size and diversity of the WebBase.
Copyright © 1998 by The Institute of Electrical and Electronics Engineers, Inc. (IEEE). Abstract used with permission.