Top-k aggregation using intersections of ranked inputs

Ravi Kumar, Kunal Punera, Torsten Suel, Sergei Vassilvitskii

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    Abstract

    There has been considerable past work on efficiently computing the top k objects by aggregating information from multiple ranked lists of these objects. An important instance of this problem is query processing in search engines: one has to combine information from several different posting lists (rankings) of web pages (objects) to obtain the top k web pages to answer user queries. Two particularly well-studied approaches to achieving efficiency in top-k aggregation are early-termination algorithms (e.g., TA and NRA) and pre-aggregation of some of the input lists. However, there has been little work on a rigorous treatment of combining these approaches. We generalize the TA and NRA algorithms to the case when pre-aggregated intersection lists are available in addition to the original lists. We show that our versions of TA and NRA remain "instance optimal," a very strong optimality notion that is a highlight of the original TA and NRA algorithms. Using an index of millions of web pages and real-world search engine queries, we empirically characterize the performance gains offered by our new algorithms. We show that the practical benefits of intersection lists can be fully realized only with an early-termination algorithm.
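
    As background for the early-termination approach named in the abstract, the following is a minimal sketch of the classic Threshold Algorithm (TA) with sum aggregation. It is not the generalized, intersection-aware variant that the paper proposes. The function name `threshold_algorithm`, the input layout (one list of `(object_id, score)` pairs per ranking), non-negative scores, and the convention that an object missing from a list scores 0 there are all assumptions made for illustration.

```python
import heapq


def threshold_algorithm(lists, k):
    """Sketch of classic TA with sum aggregation (illustrative assumptions).

    lists: one list per ranking, each holding (object_id, score) pairs
           sorted by score in descending order; scores are non-negative.
    An object absent from a list is assumed to score 0 in that list.
    Returns up to k (total_score, object_id) pairs, best first.
    """
    # Random access: one dict per list, object_id -> score.
    lookup = [dict(lst) for lst in lists]
    seen = set()
    top = []  # min-heap of (total_score, object_id), at most k entries
    max_depth = max((len(lst) for lst in lists), default=0)

    for depth in range(max_depth):
        last_scores = []  # score at the current depth in each list
        for lst in lists:
            if depth >= len(lst):
                # Exhausted list: any unseen object scores 0 here.
                last_scores.append(0)
                continue
            obj, score = lst[depth]  # sorted access
            last_scores.append(score)
            if obj in seen:
                continue
            seen.add(obj)
            # Random access to every list to get the full aggregate score.
            total = sum(d.get(obj, 0) for d in lookup)
            if len(top) < k:
                heapq.heappush(top, (total, obj))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, obj))

        # No unseen object can have an aggregate score above this threshold.
        threshold = sum(last_scores)
        if len(top) == k and top[0][0] >= threshold:
            break  # early termination

    return sorted(top, key=lambda pair: (-pair[0], pair[1]))


# Toy posting lists (hypothetical data), each sorted by descending score.
example_lists = [
    [("a", 9), ("b", 8), ("c", 1)],
    [("b", 7), ("c", 6), ("a", 2)],
]
best = threshold_algorithm(example_lists, 1)
```

    The loop stops as soon as the k-th best total seen so far is at least the threshold, which is what lets TA avoid scanning the lists to the end.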

    Original language: English (US)
    Title of host publication: Proceedings of the 2nd ACM International Conference on Web Search and Data Mining, WSDM'09
    Pages: 222-231
    Number of pages: 10
    DOI: https://doi.org/10.1145/1498759.1498830
    State: Published - 2009
    Event: 2nd ACM International Conference on Web Search and Data Mining, WSDM'09 - Barcelona, Spain
    Duration: Feb 9 2009 to Feb 12 2009

    Publication series

    Name: Proceedings of the 2nd ACM International Conference on Web Search and Data Mining, WSDM'09

    Other

    Other: 2nd ACM International Conference on Web Search and Data Mining, WSDM'09
    Country: Spain
    City: Barcelona
    Period: 2/9/09 to 2/12/09

    Keywords

    • Early-termination
    • Intersections
    • NRA
    • TA

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Software

    Cite this

    Kumar, R., Punera, K., Suel, T., & Vassilvitskii, S. (2009). Top-k aggregation using intersections of ranked inputs. In Proceedings of the 2nd ACM International Conference on Web Search and Data Mining, WSDM'09 (pp. 222-231). https://doi.org/10.1145/1498759.1498830