Efficient query subscription processing for prospective search engines

Utku Irmak, Svilen Mihaylov, Torsten Suel, Samrat Ganguly, Rauf Izmailov

    Research output: Contribution to conference › Paper

    Abstract

    Current web search engines are retrospective in that they limit users to searches against already existing pages. Prospective search engines, on the other hand, allow users to upload queries that will be applied to newly discovered pages in the future. Some examples of prospective search are the subscription features in Google News and in RSS-based blog search engines. In this paper, we study the problem of efficiently processing large numbers of keyword query subscriptions against a stream of newly discovered documents, and propose several query processing optimizations for prospective search. Our experimental evaluation shows that these techniques can improve the throughput of a well-known algorithm by more than a factor of 20, and allow matching hundreds or thousands of incoming documents per second against millions of subscription queries per node.
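
    To make the problem setting concrete, the following is a minimal sketch of matching keyword subscriptions against incoming documents. It is not the algorithm or any of the optimizations from the paper. It assumes, for illustration only, that a subscription matches when all of its keywords occur in the document (AND semantics) and that text is tokenized by whitespace. The class and method names (`SubscriptionIndex`, `subscribe`, `match`) are hypothetical.

```python
from collections import defaultdict


class SubscriptionIndex:
    """Toy index of keyword subscriptions (illustrative, not from the paper)."""

    def __init__(self):
        self.query_terms = {}             # query id -> set of required terms
        self.postings = defaultdict(set)  # term -> ids of queries containing it

    def subscribe(self, query_id, text):
        """Register a query; assumed semantics: every term must appear."""
        terms = set(text.lower().split())
        if not terms:
            raise ValueError("empty query")
        self.query_terms[query_id] = terms
        for term in terms:
            self.postings[term].add(query_id)

    def match(self, document):
        """Return sorted ids of queries whose terms all occur in the document."""
        doc_terms = set(document.lower().split())
        hits = defaultdict(int)           # query id -> number of its terms seen
        for term in doc_terms:
            for query_id in self.postings.get(term, ()):
                hits[query_id] += 1
        return sorted(
            q for q, n in hits.items() if n == len(self.query_terms[q])
        )


index = SubscriptionIndex()
index.subscribe("q1", "usenix boston")
index.subscribe("q2", "blog search")
index.subscribe("q3", "boston")
print(index.match("USENIX conference held in Boston"))  # ['q1', 'q3']
```

    The roles are reversed relative to retrospective search: the queries are indexed ahead of time and each new document is used to probe that index, so the per-document cost depends on how many subscriptions share terms with the document rather than on the total number of subscriptions.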

    Original language: English (US)
    Pages: 375-380
    Number of pages: 6
    State: Published - 2006
    Event: 2006 USENIX Annual Technical Conference - Boston, United States
    Duration: May 30 2006 - Jun 3 2006

    Conference

    Conference: 2006 USENIX Annual Technical Conference
    Country: United States
    City: Boston
    Period: 5/30/06 - 6/3/06

    ASJC Scopus subject areas

    • Computer Science (all)


  • Cite this

    Irmak, U., Mihaylov, S., Suel, T., Ganguly, S., & Izmailov, R. (2006). Efficient query subscription processing for prospective search engines. 375-380. Paper presented at 2006 USENIX Annual Technical Conference, Boston, United States.