Efficient query evaluation on large textual collections in a peer-to-peer environment

Jiangong Zhang, Torsten Suel

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    We study the problem of evaluating ranked (top-k) queries on textual collections ranging from multiple gigabytes to terabytes in size. We focus on the case of a global index organization in a highly distributed environment, and consider a class of ranking functions that includes common variants of the Cosine and Okapi measures. The main bottleneck in such a scenario is the amount of communication required during query evaluation. We propose several efficient query evaluation schemes and evaluate their performance. Our results on real search engine query traces and over 120 million web pages show that after careful optimization such queries can be evaluated at a reasonable cost, while challenges remain for even larger collections and more general classes of ranking functions.

    Original language: English (US)
    Title of host publication: Proceedings - Fifth IEEE International Conference on Peer-to-Peer Computing, P2P 2005
    Pages: 225-233
    Number of pages: 9
    State: Published - 2005
    Event: 5th IEEE International Conference on Peer-to-Peer Computing, P2P 2005 - Konstanz, Germany
    Duration: Aug 31 2005 to Sep 2 2005

    Publication series

    Name: Proceedings - Fifth IEEE International Conference on Peer-to-Peer Computing, P2P 2005
    Volume: 2005

    Other

    Other: 5th IEEE International Conference on Peer-to-Peer Computing, P2P 2005
    Country/Territory: Germany
    City: Konstanz
    Period: 8/31/05 to 9/2/05

    ASJC Scopus subject areas

    • General Engineering
