Learning a predictable and generative vector representation for objects

Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding.
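The two-component design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear maps stand in for the autoencoder and the convolutional network, and every name, size, and loss here (`VOXELS`, `PIXELS`, `EMBED`, `W_enc`, `W_dec`, `W_img`, squared-error losses) is an assumption made for illustration. Only the wiring is shown, with no training loop.

```python
import random

random.seed(0)

# Toy sizes, all hypothetical (not taken from the paper).
VOXELS = 8   # length of a flattened voxel grid
PIXELS = 6   # length of a flattened 2D image
EMBED = 3    # dimensionality of the shared embedding


def make_matrix(rows, cols):
    """Small random weight matrix stored as a list of rows."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Linear stand-ins for the learned components (assumption: the real
# components are deep networks, not single linear layers).
W_enc = make_matrix(EMBED, VOXELS)  # autoencoder encoder: voxels -> embedding
W_dec = make_matrix(VOXELS, EMBED)  # autoencoder decoder: embedding -> voxels
W_img = make_matrix(EMBED, PIXELS)  # image network: pixels -> embedding


def encode_voxels(voxels):
    return matvec(W_enc, voxels)


def decode_embedding(z):
    return matvec(W_dec, z)


def embed_image(image):
    return matvec(W_img, image)


def reconstruction_loss(voxels):
    """'Generative' term: the embedding must decode back to the 3D shape."""
    return sq_dist(decode_embedding(encode_voxels(voxels)), voxels)


def prediction_loss(image, voxels):
    """'Predictable' term: the image network must land on the shape's embedding."""
    return sq_dist(embed_image(image), encode_voxels(voxels))


def predict_voxels(image):
    """Voxel prediction from a 2D image: image network, then the decoder."""
    return decode_embedding(embed_image(image))


voxels = [1, 0, 0, 1, 1, 0, 1, 0]        # toy occupancy grid
image = [0.2, 0.5, 0.1, 0.9, 0.3, 0.7]   # toy image
print(reconstruction_loss(voxels), prediction_loss(image, voxels))
print(predict_voxels(image))
```

In this sketch, voxel prediction from an image reuses the autoencoder's decoder, and model retrieval could be framed as a nearest-neighbour search over `encode_voxels` outputs using `sq_dist`; both readings are illustrative rather than a description of the published system.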

Original language: English (US)
Title of host publication: Computer Vision - 14th European Conference, ECCV 2016, Proceedings
Editors: Bastian Leibe, Jiri Matas, Nicu Sebe, Max Welling
Publisher: Springer Verlag
Pages: 484-499
Number of pages: 16
ISBN (Print): 9783319464657
State: Published - 2016
Event: 14th European Conference on Computer Vision, ECCV 2016 - Amsterdam, Netherlands
Duration: Oct 8 2016 to Oct 16 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9910 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th European Conference on Computer Vision, ECCV 2016
Country/Territory: Netherlands
City: Amsterdam
Period: 10/8/16 to 10/16/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
