Abstract
Current web search engines are retrospective in that they limit users to searches against already existing pages. Prospective search engines, on the other hand, allow users to upload queries that will be applied to newly discovered pages in the future. Some examples of prospective search are the subscription features in Google News and in RSS-based blog search engines. In this paper, we study the problem of efficiently processing large numbers of keyword query subscriptions against a stream of newly discovered documents, and propose several query processing optimizations for prospective search. Our experimental evaluation shows that these techniques can improve the throughput of a well-known algorithm by more than a factor of 20, and allow matching hundreds or thousands of incoming documents per second against millions of subscription queries per node.
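To make the matching problem concrete, the following is a minimal sketch of prospective search in which stored keyword queries are indexed by term and each incoming document is matched against them. It assumes conjunctive (AND) keyword queries and whitespace tokenization, and it uses a generic counting approach chosen for illustration. It is not necessarily the algorithm evaluated in the paper, and it includes none of the paper's optimizations. The names `SubscriptionIndex`, `subscribe`, and `match` are hypothetical.

```python
from collections import defaultdict


class SubscriptionIndex:
    """Illustrative inverted index over stored queries (not over documents)."""

    def __init__(self):
        # term -> list of ids of the queries that contain that term
        self.postings = defaultdict(list)
        # query id -> number of distinct terms the query requires
        self.query_sizes = {}

    def subscribe(self, query_id, keywords):
        """Register a conjunctive keyword query under a unique id."""
        terms = {k.lower() for k in keywords}
        if not terms:
            raise ValueError("a subscription needs at least one keyword")
        if query_id in self.query_sizes:
            raise ValueError("query id already registered")
        self.query_sizes[query_id] = len(terms)
        for term in terms:
            self.postings[term].append(query_id)

    def match(self, document_text):
        """Return the sorted ids of all queries whose terms all occur in the document."""
        # Assumption: lowercase whitespace tokenization stands in for real parsing.
        doc_terms = set(document_text.lower().split())
        hits = defaultdict(int)
        for term in doc_terms:
            for query_id in self.postings.get(term, ()):
                hits[query_id] += 1
        # A query matches when every one of its distinct terms was seen.
        return sorted(q for q, n in hits.items() if n == self.query_sizes[q])
```

In this sketch the cost of matching a document grows with the total length of the posting lists for the terms it contains, which is the kind of per-document work that query processing optimizations for prospective search aim to reduce.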
Original language | English (US) |
---|---|
Pages | 375-380 |
Number of pages | 6 |
State | Published - 2006 |
Event | 2006 USENIX Annual Technical Conference - Boston, United States; Duration: May 30 2006 → Jun 3 2006 |
Conference
Conference | 2006 USENIX Annual Technical Conference |
---|---|
Country | United States |
City | Boston |
Period | 5/30/06 → 6/3/06 |
ASJC Scopus subject areas
- Computer Science(all)