TY - GEN
T1 - Structural sentence similarity estimation for short texts
AU - Ma, Weicheng
AU - Suel, Torsten
PY - 2016
Y1 - 2016
N2 - Sentence similarity is the basis of most text-related tasks. In this paper, we define a new sentence similarity estimation task aimed specifically at short, informal, social-network-style sentences. The new type of sentence similarity, which we call Structural Similarity, sets aside syntactic and grammatical features such as dependency paths and Part-of-Speech (POS) tags, which are not sufficiently representative for short sentences. Structural Similarity does not consider the actual meanings of the sentences either; instead, it emphasizes similarities in sentence structure in order to discover purpose- or emotion-level similarities. The idea is based on the observation that people tend to use sentences with similar structures to express similar feelings. Besides the definition, we present a new feature set and a mechanism for calculating the scores, and, to disambiguate word senses, we propose a variant of the Word2Vec model to represent words. We demonstrate the correctness and advantages of our sentence similarity measure through experiments.
UR - http://www.scopus.com/inward/record.url?scp=85004010353&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85004010353&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85004010353
T3 - Proceedings of the 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016
SP - 232
EP - 237
BT - Proceedings of the 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016
A2 - Markov, Zdravko
A2 - Russell, Ingrid
PB - AAAI Press
T2 - 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016
Y2 - 16 May 2016 through 18 May 2016
ER -