Synergistic Face Detection and Pose Estimation with Energy-Based Models
Margarita Osadchy, Yann Le Cun, Matthew L. Miller; 8(May):1197--1215, 2007.
We describe a novel method for simultaneously detecting faces and estimating their pose in real time. The method employs a convolutional network to map images of faces to points on a low-dimensional manifold parametrized by pose, and images of non-faces to points far away from that manifold. Given an image, detecting a face and estimating its pose is viewed as minimizing an energy function with respect to the face/non-face binary variable and the continuous pose parameters. The system is trained to minimize a loss function that drives correct combinations of labels and pose to be associated with lower energy values than incorrect ones.
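The joint minimization over the face/non-face label and the pose can be illustrated with a toy sketch. It is not the system described here: the convolutional network is replaced by a precomputed 2-D embedding, the pose manifold is assumed to be a unit circle indexed by a single angle, the non-face label is assumed to carry a constant energy `threshold`, and the names `manifold_point`, `energy`, and `detect_and_estimate` are hypothetical.

```python
import math

def manifold_point(pose):
    """Hypothetical pose manifold: a unit circle indexed by one angle (radians)."""
    return (math.cos(pose), math.sin(pose))

def energy(embedding, is_face, pose, threshold):
    """Energy of one (label, pose) combination for an embedded image.

    Assumption: a face's energy is its distance to the manifold point for
    the given pose; the non-face label has a constant energy `threshold`.
    """
    if not is_face:
        return threshold
    return math.dist(embedding, manifold_point(pose))

def detect_and_estimate(embedding, threshold=0.5, steps=360):
    """Minimize the energy over the binary label and a grid of candidate poses."""
    poses = [-math.pi + 2 * math.pi * k / steps for k in range(steps)]
    # Best pose under the "face" hypothesis: the closest manifold point.
    best_pose = min(poses, key=lambda p: energy(embedding, True, p, threshold))
    face_energy = energy(embedding, True, best_pose, threshold)
    # Pick the label with the lower energy.
    is_face = face_energy < energy(embedding, False, None, threshold)
    return is_face, (best_pose if is_face else None), face_energy
```

Under these assumptions, an embedding that lies near the circle is labeled a face and receives the pose of the nearest manifold point, while an embedding far from the circle is labeled a non-face. A grid search stands in for whatever minimization over pose a real system would use.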
The system is designed to handle a very large range of poses without retraining. Its performance, tested on three standard data sets (for frontal views, rotated faces, and profiles), is comparable to that of previous systems designed to handle only one of these data sets.
We show that a system trained simultaneously for detection and pose estimation is more accurate on both tasks than similar systems trained for each task separately.