Optimized Query Execution in Large Search Engines with Global Page Ordering

Xiaohui Long, Torsten Suel

    Research output: Chapter in Book/Report/Conference proceeding › Chapter

    Abstract

    This chapter discusses optimized query execution in large search engines with global page ordering. Large web search engines have to answer thousands of queries per second with interactive response times. A major factor in the cost of executing a query is the length of the inverted lists for the query terms, which grows with the size of the document collection and is often in the range of many megabytes. To address this issue, information retrieval (IR) and database researchers have proposed pruning techniques that compute or approximate term-based ranking functions without scanning the full inverted lists. This chapter focuses on how such techniques can be efficiently integrated into query processing. It studies pruning techniques for query execution in large engines in the case where a global ranking of pages, as provided by PageRank or any other method, is available in addition to the standard term-based approach. The chapter describes pruning schemes for this case and evaluates their efficiency on an experimental cluster-based search engine with 120 million web pages. The results show that there is significant potential benefit in such techniques.

    Original language: English (US)
    Title of host publication: Proceedings 2003 VLDB Conference
    Subtitle of host publication: 29th International Conference on Very Large Databases (VLDB)
    Publisher: Elsevier
    Pages: 129-140
    Number of pages: 12
    ISBN (Electronic): 9780127224428
    State: Published - Jan 1 2003

    ASJC Scopus subject areas

    • General Computer Science
