Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation

Part of: Advances in Neural Information Processing Systems 27 (NIPS 2014)

This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints such as geometric relationships between body joint locations. We show that joint training of these two model paradigms improves performance and allows us to significantly outperform existing state-of-the-art techniques.
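As a rough illustration of how a geometric relationship between joints can constrain a part detector's output, the toy sketch below reweights one joint's score map by a pairwise prior over its offset from a neighbouring joint. The abstract does not specify the model's formulation, so everything here is an assumption made for illustration: the 1-D score maps, the function name `refine`, the offset table `prior`, and the multiply-and-normalize update. It does not show the joint training of the two components.

```python
def refine(unary_target, unary_neighbor, offset_prior):
    """Reweight a target joint's scores using one neighbouring joint.

    unary_target / unary_neighbor: per-position detector scores (hypothetical).
    offset_prior: maps (target_pos - neighbor_pos) to a non-negative weight
    (hypothetical stand-in for a learned pairwise term).
    """
    n = len(unary_target)
    refined = []
    for x in range(n):
        # Message from the neighbour: how strongly its detections,
        # shifted by plausible offsets, support position x.
        message = sum(
            offset_prior.get(x - y, 0.0) * unary_neighbor[y] for y in range(n)
        )
        refined.append(unary_target[x] * message)
    total = sum(refined)
    # Normalize so the refined scores sum to 1 when any mass remains.
    return [r / total for r in refined] if total > 0 else refined


# Toy data (made up): the elbow detector is ambiguous between positions 3 and 5.
shoulder = [0.0, 0.9, 0.1, 0.0, 0.0, 0.0]
elbow = [0.0, 0.0, 0.0, 0.5, 0.0, 0.5]
# Assumed prior: the elbow usually lies about 2 positions right of the shoulder.
prior = {1: 0.2, 2: 1.0, 3: 0.2}

refined_elbow = refine(elbow, shoulder, prior)
best = max(range(len(refined_elbow)), key=refined_elbow.__getitem__)
```

In this toy case the shoulder evidence resolves the tie in favour of position 3, since position 5 is consistent only with the weak shoulder response at position 2.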