Compressing the graph structure of the Web

T. Suel, J. Yuan

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    A large amount of research has recently focused on the graph structure (or link structure) of the World Wide Web. This structure has proven to be extremely useful for improving the performance of search engines and other tools for navigating the web. However, since the graphs in these scenarios involve hundreds of millions of nodes and even more edges, highly space-efficient data structures are needed to fit the data in memory. A first step in this direction was taken by the DEC Connectivity Server, which stores the graph in compressed form. In this paper, we describe techniques for compressing the graph structure of the web, and give experimental results of a prototype implementation. We attempt to exploit a variety of different sources of compressibility of these graphs and of the associated set of URLs in order to obtain good compression performance on a large web graph.

    Original language: English (US)
    Title of host publication: Data Compression Conference Proceedings
    Editors: J.A. Storer, M. Cohn
    Pages: 213-222
    Number of pages: 10
    State: Published - 2001
    Event: Data Compression Conference - Snowbird, UT, United States
    Duration: Mar 27 2001 to Mar 29 2001

    Other

    Other: Data Compression Conference
    Country/Territory: United States
    City: Snowbird, UT
    Period: 3/27/01 to 3/29/01

    ASJC Scopus subject areas

    • Hardware and Architecture
    • Electrical and Electronic Engineering
