Abstract
We describe a novel method for simultaneously detecting faces and estimating their pose in real time. The method employs a convolutional network to map images of faces to points on a low-dimensional manifold parametrized by pose, and images of non-faces to points far away from that manifold. Given an image, detecting a face and estimating its pose is viewed as minimizing an energy function with respect to the face/non-face binary variable and the continuous pose parameters. The system is trained to minimize a loss function that drives correct combinations of labels and pose to be associated with lower energy values than incorrect ones. The system is designed to handle a very large range of poses without retraining. The performance of the system, tested on three standard data sets covering frontal views, rotated faces, and profiles, is comparable to that of previous systems designed to handle only one of these data sets. We show that a system trained simultaneously for detection and pose estimation is more accurate on both tasks than similar systems trained for each task separately.
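The energy-minimization view described above can be illustrated with a small sketch. The Python snippet below is an assumption-laden approximation, not the paper's implementation: `G` stands in for the trained convolutional network (here a fixed random linear projection), `F(z)` is a hypothetical pose-to-manifold embedding, and the constant non-face energy `threshold` is an assumed value. It shows only the inference step: minimizing the energy over the pose variable and comparing against the non-face energy.

```python
import numpy as np

# Illustrative sketch of energy-based face detection and pose estimation.
# All specific forms below (F, G, threshold) are assumptions for illustration.

def F(z, dim=3):
    """Map a scalar pose angle z (radians) to a point on a simple 1-D
    manifold embedded in `dim` dimensions (a circle plus constant coords)."""
    return np.array([np.cos(z), np.sin(z)] + [1.0] * (dim - 2))

def G(x, dim=3):
    """Placeholder for the trained convolutional network: a fixed random
    linear projection of the flattened image patch to `dim` dimensions."""
    rng = np.random.default_rng(0)
    W = rng.normal(size=(dim, x.size)) / np.sqrt(x.size)
    return W @ x.ravel()

def detect_and_estimate_pose(x, threshold=1.0, n_grid=360):
    """Minimize the energy over the binary face label and the continuous
    pose (approximated here by a dense grid search over pose angles)."""
    g = G(x)
    poses = np.linspace(-np.pi, np.pi, n_grid)
    distances = np.linalg.norm(np.stack([F(z) for z in poses]) - g, axis=1)
    best = int(np.argmin(distances))
    # Face energy = distance to the manifold; non-face energy = threshold.
    is_face = bool(distances[best] < threshold)
    return is_face, (float(poses[best]) if is_face else None)

if __name__ == "__main__":
    patch = np.zeros((32, 32), dtype=np.float32)  # dummy image patch
    print(detect_and_estimate_pose(patch))
```

In this sketch, a patch is labeled a face when some point on the pose manifold lies closer to its network output than the non-face energy, and the minimizing pose is returned as the estimate.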
| Field | Value |
|---|---|
| Original language | English (US) |
| Pages (from-to) | 1197-1215 |
| Number of pages | 19 |
| Journal | Journal of Machine Learning Research |
| Volume | 8 |
| State | Published - May 2007 |
Keywords
- Convolutional networks
- Energy based models
- Face detection
- Object recognition
- Pose estimation
ASJC Scopus subject areas
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence