A candidate filtering mechanism for fast top-K query processing on modern CPUs

Constantinos Dimopoulos, Sergey Nepomnyachiy, Torsten Suel

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Abstract

    A large amount of research has focused on faster methods for finding top-k results in large document collections, one of the main scalability challenges for web search engines. In this paper, we propose a method for accelerating such top-k queries that builds on and generalizes methods recently proposed by several groups of researchers based on Block-Max Indexes [15, 10, 13]. In particular, we describe a system that uses a new filtering mechanism, based on a combination of block maxima and bitmaps, that radically reduces the number of documents that have to be further evaluated. Our filtering mechanism exploits the SIMD processing capabilities of current microprocessors, and it is optimized through caching policies that select and store suitable filter structures based on properties of the query load. Our experimental evaluation shows that the mechanism results in very significant speed-ups for disjunctive top-k queries under several state-of-the-art algorithms, including a speed-up of more than a factor of 2 over the fastest previously known methods.

    Original languageEnglish (US)
    Title of host publicationSIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
    Pages723-732
    Number of pages10
    DOIs
    StatePublished - 2013
    Event36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013 - Dublin, Ireland
    Duration: Jul 28 2013Aug 1 2013

    Publication series

    NameSIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval

    Other

    Other36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013
    Country/TerritoryIreland
    CityDublin
    Period7/28/138/1/13

    Keywords

    • Block-max inverted index
    • Candidate filtering mechanism
    • DocID-oriented block-max index
    • Early termination
    • Live area computation
    • Posting bitset
    • Top-k query processing

    ASJC Scopus subject areas

    • Computer Graphics and Computer-Aided Design
    • Information Systems

    Fingerprint

    Dive into the research topics of 'A candidate filtering mechanism for fast top-K query processing on modern CPUs'. Together they form a unique fingerprint.

    Cite this