TY - GEN
T1 - A candidate filtering mechanism for fast top-K query processing on modern CPUs
AU - Dimopoulos, Constantinos
AU - Nepomnyachiy, Sergey
AU - Suel, Torsten
PY - 2013
Y1 - 2013
N2 - A large amount of research has focused on faster methods for finding top-k results in large document collections, one of the main scalability challenges for web search engines. In this paper, we propose a method for accelerating such top-k queries that builds on and generalizes methods recently proposed by several groups of researchers based on Block-Max Indexes [15, 10, 13]. In particular, we describe a system that uses a new filtering mechanism, based on a combination of block maxima and bitmaps, that radically reduces the number of documents that have to be further evaluated. Our filtering mechanism exploits the SIMD processing capabilities of current microprocessors, and it is optimized through caching policies that select and store suitable filter structures based on properties of the query load. Our experimental evaluation shows that the mechanism results in very significant speed-ups for disjunctive top-k queries under several state-of-the-art algorithms, including a speed-up of more than a factor of 2 over the fastest previously known methods.
AB - A large amount of research has focused on faster methods for finding top-k results in large document collections, one of the main scalability challenges for web search engines. In this paper, we propose a method for accelerating such top-k queries that builds on and generalizes methods recently proposed by several groups of researchers based on Block-Max Indexes [15, 10, 13]. In particular, we describe a system that uses a new filtering mechanism, based on a combination of block maxima and bitmaps, that radically reduces the number of documents that have to be further evaluated. Our filtering mechanism exploits the SIMD processing capabilities of current microprocessors, and it is optimized through caching policies that select and store suitable filter structures based on properties of the query load. Our experimental evaluation shows that the mechanism results in very significant speed-ups for disjunctive top-k queries under several state-of-the-art algorithms, including a speed-up of more than a factor of 2 over the fastest previously known methods.
KW - Block-max inverted index
KW - Candidate filtering mechanism
KW - DocID-oriented block-max index
KW - Early termination
KW - Live area computation
KW - Posting bitset
KW - Top-k query processing
UR - http://www.scopus.com/inward/record.url?scp=84883116336&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883116336&partnerID=8YFLogxK
U2 - 10.1145/2484028.2484087
DO - 10.1145/2484028.2484087
M3 - Conference contribution
AN - SCOPUS:84883116336
SN - 9781450320344
T3 - SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 723
EP - 732
BT - SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
T2 - 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013
Y2 - 28 July 2013 through 1 August 2013
ER -