Human Pose Estimation with Fields of Parts

Martin Kiefel and Peter V. Gehler
MPI for Intelligent Systems, Tübingen, Germany
ECCV 2014 – European Conference on Computer Vision

This paper proposes a new formulation of the human pose estimation problem. We present the Fields of Parts model, a binary Conditional Random Field model designed to detect human body parts of articulated people in single images.

The Fields of Parts model is inspired by Pictorial Structures: it models the local appearance and the joint spatial configuration of the human body. However, the underlying graph structure is entirely different. The idea is simple: we model the presence or absence of a body part at every possible position, orientation, and scale in an image with a binary random variable. This results in a vast number of random variables; nevertheless, we show that approximate inference in this model is efficient. Moreover, we can encode the very same appearance and spatial structure as in Pictorial Structures models.
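To give a feel for the size of such a model, the following minimal sketch counts the binary variables for one hypothetical discretization. The image size, stride, number of orientations, number of scales, and number of parts are illustrative assumptions, not values taken from the paper, and `field_shape` and `count_variables` are made-up helper names.

```python
from math import prod


def field_shape(height, width, n_orientations, n_scales, stride=1):
    """Shape of the binary field for one body part.

    One 0/1 variable per (row, column, orientation, scale) state.
    The stride-based position grid is an assumption for illustration.
    """
    rows = (height + stride - 1) // stride  # ceiling division
    cols = (width + stride - 1) // stride
    return (rows, cols, n_orientations, n_scales)


def count_variables(n_parts, shape):
    """Total number of binary variables: one field per body part."""
    return n_parts * prod(shape)


# Hypothetical setting: 200x100 image, positions every 4 pixels,
# 8 orientations, a single scale, 14 body parts.
shape = field_shape(height=200, width=100, n_orientations=8, n_scales=1, stride=4)
n_vars = count_variables(n_parts=14, shape=shape)
print(shape, n_vars)
```

Even this coarse grid yields on the order of 10^5 binary variables, which is why efficient approximate inference matters for this formulation.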

This approach allows us to combine ideas from segmentation and pose estimation into a single model. The Fields of Parts model can use evidence from the background, can include local color information, and is connected more densely than a kinematic chain structure. On the challenging Leeds Sports Poses dataset we improve over the Pictorial Structures counterpart by 6.0% in terms of Average Precision of Keypoints.

From Pictorial Structures models (left) to the Fields of Parts model (right). For each body part in the PS model we introduce a field of binary random variables, one for each of its states. When two body parts are connected by a pairwise factor (left), we densely connect the corresponding fields (right), as illustrated by the stacked factors. Each binary variable (0/1) encodes the absence or presence of a body part at its location and type (rotation). The resulting graph is dense and thus contains multiple cycles.
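The dense connection between two fields can be illustrated with a small sketch. It assumes a simple score with one unary weight per state and one pairwise weight per pair of states; this functional form, the random weights, and the name `score` are assumptions for illustration and not the potentials used in the paper. When exactly one state is active in each field, the score reduces to the familiar Pictorial Structures form of two unary terms plus one pairwise term.

```python
import random


def score(x_a, x_b, unary_a, unary_b, pairwise):
    """Score of a binary labeling of two densely connected fields.

    x_a, x_b: lists of 0/1 indicators, one per state of part A and part B.
    pairwise[i][j]: weight of the factor linking state i of A to state j of B.
    """
    s = sum(u * x for u, x in zip(unary_a, x_a))
    s += sum(u * x for u, x in zip(unary_b, x_b))
    s += sum(
        pairwise[i][j] * x_a[i] * x_b[j]
        for i in range(len(x_a))
        for j in range(len(x_b))
    )
    return s


random.seed(0)
n_a, n_b = 6, 5  # hypothetical number of states per part
unary_a = [random.gauss(0, 1) for _ in range(n_a)]
unary_b = [random.gauss(0, 1) for _ in range(n_b)]
pairwise = [[random.gauss(0, 1) for _ in range(n_b)] for _ in range(n_a)]

# Dense connection: one pairwise factor for every pair of states.
n_factors = n_a * n_b

# One active state per part, as in a Pictorial Structures configuration.
x_a = [0] * n_a
x_b = [0] * n_b
x_a[2] = 1
x_b[4] = 1

fop_score = score(x_a, x_b, unary_a, unary_b, pairwise)
ps_score = unary_a[2] + unary_b[4] + pairwise[2][4]
print(n_factors, abs(fop_score - ps_score) < 1e-12)
```

Unlike a Pictorial Structures state, a binary labeling may also switch on several states or none at all, which is what lets the field represent absence of a part.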