TY - GEN
T1 - Querying Wikipedia documents and relationships
AU - Nguyen, Huong
AU - Nguyen, Thanh
AU - Nguyen, Hoa
AU - Freire, Juliana
PY - 2010
Y1 - 2010
N2 - Wikipedia has become an important source of information which is growing very rapidly. However, the existing infrastructure for querying this information is limited and often ignores the inherent structure in the information and links across documents. In this paper, we present a new approach for querying Wikipedia content that supports a simple yet expressive query interface that allows both keyword and structured queries. A unique feature of our approach is that, besides returning documents that match the queries, it also exploits relationships among documents to return richer, multi-document answers. We model Wikipedia as a graph and cast the problem of finding answers for queries as graph search. To guide the answer-search process, we propose a novel weighting scheme to identify important nodes and edges in the graph. By leveraging the structured information available in infoboxes, our approach supports queries that specify constraints over this structure, and we propose a new search algorithm to support these queries. We evaluate our approach using a representative subset of Wikipedia documents and present results which show that our approach is effective and derives high-quality answers.
AB - Wikipedia has become an important source of information which is growing very rapidly. However, the existing infrastructure for querying this information is limited and often ignores the inherent structure in the information and links across documents. In this paper, we present a new approach for querying Wikipedia content that supports a simple yet expressive query interface that allows both keyword and structured queries. A unique feature of our approach is that, besides returning documents that match the queries, it also exploits relationships among documents to return richer, multi-document answers. We model Wikipedia as a graph and cast the problem of finding answers for queries as graph search. To guide the answer-search process, we propose a novel weighting scheme to identify important nodes and edges in the graph. By leveraging the structured information available in infoboxes, our approach supports queries that specify constraints over this structure, and we propose a new search algorithm to support these queries. We evaluate our approach using a representative subset of Wikipedia documents and present results which show that our approach is effective and derives high-quality answers.
UR - http://www.scopus.com/inward/record.url?scp=78650463127&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650463127&partnerID=8YFLogxK
U2 - 10.1145/1859127.1859133
DO - 10.1145/1859127.1859133
M3 - Conference contribution
AN - SCOPUS:78650463127
SN - 9781450301862
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
BT - Proceedings of the 13th International Workshop on the Web and Databases, WebDB 2010, Co-located with ACM SIGMOD 2010
PB - Association for Computing Machinery
T2 - 13th International Workshop on the Web and Databases, WebDB 2010, Co-located with ACM SIGMOD 2010
Y2 - 6 June 2010
ER -