TY - JOUR
T1 - GLoMo: Unsupervised Learning of Transferable Relational Graphs
T2 - 32nd Conference on Neural Information Processing Systems, NeurIPS 2018
AU - Yang, Zhilin
AU - Zhao, Jake
AU - Dhingra, Bhuwan
AU - He, Kaiming
AU - Cohen, William W.
AU - Salakhutdinov, Ruslan
AU - LeCun, Yann
N1 - Funding Information:
This work was supported in part by the Office of Naval Research, DARPA award D17AP00001, Apple, the Google Focused Award, and the Nvidia NVAIL Award. ZY is supported by the Nvidia PhD Fellowship. The authors would also like to thank Sam Bowman for useful discussions.
PY - 2018
Y1 - 2018
AB - Modern deep transfer learning approaches have mainly focused on learning generic feature vectors from one task that are transferable to other tasks, such as word embeddings in language and pretrained convolutional features in vision. However, these approaches usually transfer unary features and largely ignore more structured graphical representations. This work explores the possibility of learning generic latent relational graphs that capture dependencies between pairs of data units (e.g., words or pixels) from large-scale unlabeled data and transferring the graphs to downstream tasks. Our proposed transfer learning framework improves performance on various tasks including question answering, natural language inference, sentiment analysis, and image classification. We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.
UR - http://www.scopus.com/inward/record.url?scp=85064838782&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064838782&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85064838782
SN - 1049-5258
VL - 2018-December
SP - 8950
EP - 8961
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
Y2 - 2 December 2018 through 8 December 2018
ER -