Abstract
Predicting depth is an essential component in understanding the 3D geometry of a scene. While for stereo images local correspondence suffices for estimation, finding depth relations from a single image is less straightforward, requiring integration of both global and local information from various cues. Moreover, the task is inherently ambiguous, with a large source of uncertainty coming from the overall scale. In this paper, we present a new method that addresses this task by employing two deep network stacks: one that makes a coarse global prediction based on the entire image, and another that refines this prediction locally. We also apply a scale-invariant error to help measure depth relations rather than scale. By leveraging the raw datasets as large sources of training data, our method achieves state-of-the-art results on both NYU Depth and KITTI, and matches detailed depth boundaries without the need for superpixelation.
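The abstract names a scale-invariant error but does not define it. As a rough illustration only, the sketch below assumes one common formulation: the variance of the per-pixel differences between predicted and ground-truth log depths, which does not change when every prediction is multiplied by the same constant. The function name `scale_invariant_log_error` and the list-based interface are hypothetical and are not taken from this record.

```python
import math


def scale_invariant_log_error(pred, target):
    """Variance of the log-depth differences (assumed formulation).

    pred, target: equal-length sequences of strictly positive depths.
    Returns mean(d^2) - mean(d)^2, where d_i = log(pred_i) - log(target_i).
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")

    # Per-pixel difference in log space; math.log raises on non-positive depths.
    d = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    n = len(d)

    mean_of_squares = sum(x * x for x in d) / n
    square_of_mean = (sum(d) / n) ** 2

    # Subtracting the squared mean cancels any global scale offset.
    return mean_of_squares - square_of_mean
```

Multiplying every prediction by the same factor shifts each log difference by a constant, which the second term cancels, so under this assumption the error depends only on relative depth and not on overall scale.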
Original language | English (US) |
---|---|
Title of host publication | Advances in Neural Information Processing Systems |
Publisher | Neural Information Processing Systems Foundation |
Pages | 2366-2374 |
Number of pages | 9 |
Volume | 3 |
Edition | January |
State | Published - 2014 |
Event | 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014, Montreal, Canada; Duration: Dec 8 2014 → Dec 13 2014 |
Other
Other | 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 |
---|---|
Country/Territory | Canada |
City | Montreal |
Period | 12/8/14 → 12/13/14 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing