Part of Advances in Neural Information Processing Systems 20 (NIPS 2007)
Leonid Sigal, Alexandru Balan, Michael Black
Estimation of three-dimensional articulated human pose and motion from images is a central problem in computer vision. Much of the previous work has been limited by the use of crude generative models of humans represented as articulated collections of simple parts such as cylinders. Automatic initialization of such models has proved difficult, and most approaches assume that the size and shape of the body parts are known a priori. In this paper, we propose a method for automatically recovering a detailed parametric model of non-rigid body shape and pose from monocular imagery. Specifically, we represent the body using a parameterized triangulated mesh model that is learned from a database of human range scans. We demonstrate a discriminative method to directly recover the model parameters from monocular images using a conditional mixture of kernel regressors. The predicted pose and shape are then used to initialize a generative model for more detailed pose and shape estimation. The resulting approach allows fully automatic pose and shape recovery from monocular and multi-camera imagery. Experimental results show that our method is capable of robustly recovering articulated pose, shape, and biometric measurements (e.g., height and weight) in both calibrated and uncalibrated camera environments.
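The abstract names a conditional mixture of kernel regressors but does not give its form. As a rough illustration of the general idea only, the sketch below combines several kernel regressors ("experts") through input-dependent gate weights and returns the candidate predictions ranked by gate weight. Everything specific here is an assumption rather than the authors' method: the Nadaraya-Watson estimator used for each expert, the Gaussian gate centered on a per-expert point, the shared `bandwidth`, and all function and variable names are hypothetical. The one-dimensional toy data stands in for image features and model parameters.

```python
import math


def rbf(a, b, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * bandwidth ** 2))


def kernel_regress(x, X, Y, bandwidth):
    """Nadaraya-Watson estimate: kernel-weighted average of training outputs.

    Assumption: the source does not say which kernel regressor is used.
    """
    weights = [rbf(x, xi, bandwidth) for xi in X]
    total = sum(weights)
    if total == 0.0:
        # Query is numerically far from every sample: fall back to the mean.
        weights = [1.0] * len(X)
        total = float(len(X))
    dim = len(Y[0])
    return [sum(w * y[d] for w, y in zip(weights, Y)) / total
            for d in range(dim)]


def gate_weights(x, centers, bandwidth):
    """Input-dependent mixing weights (hypothetical Gaussian gate)."""
    scores = [rbf(x, c, bandwidth) for c in centers]
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(centers)] * len(centers)
    return [s / total for s in scores]


def predict_modes(x, experts, bandwidth=1.0):
    """Return (gate weight, prediction) pairs, most probable expert first."""
    gates = gate_weights(x, [e["center"] for e in experts], bandwidth)
    preds = [kernel_regress(x, e["X"], e["Y"], bandwidth) for e in experts]
    return sorted(zip(gates, preds), key=lambda gp: gp[0], reverse=True)


# Toy data: two experts, each covering a different region of feature space.
experts = [
    {"center": [0.5], "X": [[0.0], [1.0]], "Y": [[0.0], [1.0]]},
    {"center": [4.5], "X": [[4.0], [5.0]], "Y": [[10.0], [12.0]]},
]
modes = predict_modes([0.5], experts)
top_weight, top_prediction = modes[0]
```

Keeping every expert's prediction, rather than averaging them, preserves the multiple hypotheses that arise when one image is consistent with several poses; in a pipeline like the one described, the highest-weighted prediction (or several of them) could then serve as the starting point for the generative refinement.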