Predicting object dynamics in scenes

David F. Fouhey, C. Lawrence Zitnick

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Given a static scene, a human can trivially enumerate the myriad of things that can happen next and characterize the relative likelihood of each. In the process, we make use of enormous amounts of commonsense knowledge about how the world works. In this paper, we investigate learning this commonsense knowledge from data. To overcome a lack of densely annotated spatiotemporal data, we learn from sequences of abstract images gathered using crowdsourcing. The abstract scenes provide both object location and attribute information. We demonstrate qualitatively and quantitatively that our models produce plausible scene predictions on both the abstract images and natural images taken from the Internet.

Original language: English (US)
Title of host publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher: IEEE Computer Society
Pages: 2027-2034
Number of pages: 8
ISBN (Electronic): 9781479951178
State: Published - Sep 24 2014
Event: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014 - Columbus, United States
Duration: Jun 23 2014 to Jun 28 2014

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Other

Other: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014
Country/Territory: United States
City: Columbus
Period: 6/23/14 to 6/28/14

Keywords

  • commonsense knowledge
  • prediction
  • scene dynamics
  • scene understanding

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
