TY - GEN
T1 - Learning distributed representations of sentences from unlabelled data
AU - Hill, Felix
AU - Cho, Kyunghyun
AU - Korhonen, Anna
N1 - Publisher Copyright: © 2016 Association for Computational Linguistics.
PY - 2016
Y1 - 2016
N2 - Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-bilinear models work best for building representation spaces that can be decoded with simple spatial distance metrics. We also propose two new unsupervised representation-learning objectives designed to optimise the trade-off between training time, domain portability and performance.
UR - http://www.scopus.com/inward/record.url?scp=84994157681&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994157681&partnerID=8YFLogxK
U2 - 10.18653/v1/n16-1162
DO - 10.18653/v1/n16-1162
M3 - Conference contribution
AN - SCOPUS:84994157681
T3 - 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference
SP - 1367
EP - 1377
BT - 2016 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016
Y2 - 12 June 2016 through 17 June 2016
ER -