Scene semantics from long-term observation of people

Vincent Delaitre, David F. Fouhey, Ivan Laptev, Josef Sivic, Abhinav Gupta, Alexei A. Efros

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Our everyday objects support various tasks and can be used by people for different purposes. While object classification is a widely studied topic in computer vision, recognition of object function, i.e., what people can do with an object and how they do it, is rarely addressed. In this paper we construct a functional object description with the aim to recognize objects by the way people interact with them. We describe scene objects (sofas, tables, chairs) by associated human poses and object appearance. Our model is learned discriminatively from automatically estimated body poses in many realistic scenes. In particular, we make use of time-lapse videos from YouTube providing a rich source of common human-object interactions and minimizing the effort of manual object annotation. We show how the models learned from human observations significantly improve object recognition and enable prediction of characteristic human poses in new scenes. Results are shown on a dataset of more than 400,000 frames obtained from 146 time-lapse videos of challenging and realistic indoor scenes.

Original languageEnglish (US)
Title of host publicationComputer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings
Pages284-298
Number of pages15
EditionPART 6
DOIs
StatePublished - 2012
Event12th European Conference on Computer Vision, ECCV 2012 - Florence, Italy
Duration: Oct 7 2012Oct 13 2012

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
NumberPART 6
Volume7577 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other12th European Conference on Computer Vision, ECCV 2012
Country/TerritoryItaly
CityFlorence
Period10/7/1210/13/12

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Scene semantics from long-term observation of people'. Together they form a unique fingerprint.

Cite this