Unsupervised learning of spatiotemporally coherent metrics

Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Current state-of-the-art classification and detection algorithms train deep convolutional networks using labeled data. In this work we study unsupervised feature learning with convolutional networks in the context of temporally coherent unlabeled data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity priors. We establish a connection between slow feature learning and metric learning. Using this connection we define "temporal coherence" - a criterion which can be used to set hyper-parameters in a principled and automated manner. In a transfer learning experiment, we show that the resulting encoder can be used to define a more semantically coherent metric without the use of labels.
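As a rough illustration of what an auto-encoder "regularized by slowness and sparsity priors" might look like, the sketch below combines a reconstruction term, an L1 slowness penalty on pooled features of two adjacent frames, and an L1 sparsity penalty on the codes. This is not the authors' formulation: the abstract does not give the loss, so the dense (non-convolutional) encoder and decoder, the ReLU nonlinearity, the L2 pooling over groups of units, and the names `l2_pool`, `slow_sparse_autoencoder_loss`, `alpha`, `beta`, and `group_size` are all assumptions made for brevity.

```python
import numpy as np


def l2_pool(h, group_size):
    """L2-pool non-overlapping groups of adjacent feature units (assumed pooling)."""
    n, d = h.shape
    assert d % group_size == 0, "feature dimension must be divisible by group_size"
    groups = h.reshape(n, d // group_size, group_size)
    # Small constant keeps the square root well-behaved at zero.
    return np.sqrt((groups ** 2).sum(axis=2) + 1e-8)


def slow_sparse_autoencoder_loss(x_t, x_next, W_enc, W_dec,
                                 alpha=0.5, beta=0.1, group_size=2):
    """Hypothetical loss for a pair of adjacent frames (rows are flattened frames).

    reconstruction + alpha * slowness + beta * sparsity, averaged over the batch.
    """
    # Encode both frames with a shared linear map followed by ReLU (assumption).
    h_t = np.maximum(x_t @ W_enc, 0.0)
    h_next = np.maximum(x_next @ W_enc, 0.0)

    # Reconstruction: each frame is decoded from its own code.
    recon = ((h_t @ W_dec - x_t) ** 2).sum(axis=1) \
          + ((h_next @ W_dec - x_next) ** 2).sum(axis=1)

    # Slowness: pooled features of adjacent frames should change little.
    slow = np.abs(l2_pool(h_t, group_size) - l2_pool(h_next, group_size)).sum(axis=1)

    # Sparsity: L1 penalty on the un-pooled codes.
    sparse = np.abs(h_t).sum(axis=1) + np.abs(h_next).sum(axis=1)

    return float((recon + alpha * slow + beta * sparse).mean())


# Toy usage with random data standing in for flattened video frames.
rng = np.random.default_rng(0)
frames_t = rng.normal(size=(4, 6))
frames_next = frames_t + 0.1 * rng.normal(size=(4, 6))
W_enc = rng.normal(size=(6, 8))
W_dec = rng.normal(size=(8, 6))
loss = slow_sparse_autoencoder_loss(frames_t, frames_next, W_enc, W_dec)
```

In this sketch the slowness term vanishes when the two frames are identical and grows as their pooled features diverge, which is the sense in which adjacent frames are pushed toward nearby representations.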

Original language: English (US)
Title of host publication: 2015 International Conference on Computer Vision, ICCV 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 8
ISBN (Electronic): 9781467383912
State: Published - Feb 17 2015
Event: 15th IEEE International Conference on Computer Vision, ICCV 2015 - Santiago, Chile
Duration: Dec 11 2015 → Dec 18 2015

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
Volume: 2015 International Conference on Computer Vision, ICCV 2015
ISSN (Print): 1550-5499


Other: 15th IEEE International Conference on Computer Vision, ICCV 2015

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

