Local methods for estimating PageRank values

Yen Yu Chen, Qingqing Gan, Torsten Suel

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Abstract

    The Google search engine uses a method called PageRank, together with term-based and other ranking techniques, to order search results returned to the user. PageRank uses link analysis to assign a global importance score to each web page, The PageRank scores of all the pages are usually determined off-line in a large-scale computation on the entire hyperlink graph of the web, and several recent studies have focused on improving the efficiency of this computation, which may require multiple hours on a workstation. However, in some scenarios, such as online analysis of link evolution and mining of large web archives such as the Internet Archive, it may be desirable to quickly approximate or update the PageRanks of individual nodes without performing a large-scale computation on the entire graph. We address this problem by studying several methods for efficiently estimating the PageRank score of a particular web page using only a small subgraph of the entire web. In our model, we assume that the graph is accessible remotely via a link database (such as the AltaVista Connectivity Server) or is stored in a relational database that performs lookups on disks to retrieve node and connectivity information. We show that a reasonable estimate of the PageRank value of a node is possible in most cases by retrieving only a moderate number of nodes in the local neighborhood of the node.

    Original languageEnglish (US)
    Title of host publicationCIKM 2004: Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management
    EditorsD.A. Evans, L. Gravano, O. Herzog, C. Zhai, M. Ronthaler
    Pages381-389
    Number of pages9
    StatePublished - 2004
    EventCIKM 2004: Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management - Washington, DC, United States
    Duration: Nov 8 2004Nov 13 2004

    Other

    OtherCIKM 2004: Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management
    CountryUnited States
    CityWashington, DC
    Period11/8/0411/13/04

    Keywords

    • External memory algorithms
    • Link database
    • Link-based ranking
    • Out-of-core
    • Pagerank
    • Search engines

    ASJC Scopus subject areas

    • Business, Management and Accounting(all)

    Fingerprint Dive into the research topics of 'Local methods for estimating PageRank values'. Together they form a unique fingerprint.

  • Cite this

    Chen, Y. Y., Gan, Q., & Suel, T. (2004). Local methods for estimating PageRank values. In D. A. Evans, L. Gravano, O. Herzog, C. Zhai, & M. Ronthaler (Eds.), CIKM 2004: Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management (pp. 381-389)