Markerless Motion Capture

Input: 4 Video Streams
Output: Pose Estimation
The project aims to infer the poses of a character acting in an environment filmed by a set of video cameras. Once the poses are estimated, a free-viewpoint video of the entire action is generated.
Introductory video: [MOV] [CVPR09 video]




The Blue-Room

The recording studio is a 24-square-meter room furnished with blue walls and four high-resolution cameras (scA1000-30fc http://www.baslerweb.com/). Each video stream is acquired by a dedicated computer through a 1394b bus, while video synchronization is achieved with an external hardware trigger. Each video stream is recorded at a resolution of 1034x778 pixels and 21fps. The optics consist of three 4.5mm lenses and one 3.5mm lens.
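
As a rough illustration of the data volume implied by these recording settings, the following sketch estimates the raw throughput per camera and for the whole rig. The value of one byte per pixel (8-bit raw sensor data) is an assumption and is not stated above; the function name raw_data_rate is hypothetical.

```python
def raw_data_rate(width, height, fps, bytes_per_pixel=1, cameras=1):
    """Return the uncompressed video data rate in bytes per second.

    bytes_per_pixel=1 is an assumption (8-bit raw sensor data);
    the actual pixel format used in the Blue-Room is not stated.
    """
    return width * height * bytes_per_pixel * fps * cameras


# Figures taken from the studio description: 1034x778 pixels at 21 fps.
per_camera = raw_data_rate(1034, 778, 21)
all_cameras = raw_data_rate(1034, 778, 21, cameras=4)

print(f"per camera:   {per_camera / 1e6:.1f} MB/s")
print(f"four cameras: {all_cameras / 1e6:.1f} MB/s")
print(f"one minute of recording: {all_cameras * 60 / 1e9:.2f} GB")
```

Under this assumption each camera produces a little under 17 MB of raw data per second, so a one-minute take from all four cameras occupies roughly 4 GB before any compression.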

Blue-Room tour: [MOV]
The Making-Of the Blue-Room: [YouTube]




The Human Models

Each performer is modeled as an articulated deformable object whose deformations are approximated with linear blend skinning (LBS). Shape and colors are first recovered using a home-made 3D body scanner consisting of a consumer DSLR that rotates around the character and takes pictures from known calibrated positions. A passive 3D reconstruction technique is then applied to recover both the shape and the texture. These are some of the models used in our experiments:

The acquired 3D models have more than 600k faces but, due to memory and speed constraints, they were downsampled to 13k faces. Texture resolution is about 6000x3500 pixels.
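
The linear blend skinning mentioned above can be illustrated with a minimal NumPy sketch. It is a generic textbook formulation, not the project's own implementation: each deformed vertex is the weighted sum of the positions it would take if it were rigidly attached to each bone. The function names, array layouts, and the two-bone toy example are all assumptions made for illustration.

```python
import numpy as np


def linear_blend_skinning(rest_vertices, weights, bone_transforms):
    """Deform a mesh with linear blend skinning.

    rest_vertices:   (V, 3) vertex positions in the rest pose
    weights:         (V, B) skinning weights, each row summing to 1
    bone_transforms: (B, 4, 4) homogeneous transforms that map the
                     rest pose of each bone to its current pose
    Returns the (V, 3) deformed vertex positions.
    """
    num_vertices = rest_vertices.shape[0]
    # Append w = 1 to obtain homogeneous coordinates, shape (V, 4).
    homogeneous = np.hstack([rest_vertices, np.ones((num_vertices, 1))])
    # Position of every vertex under every bone transform, shape (B, V, 4).
    per_bone = np.einsum("bij,vj->bvi", bone_transforms, homogeneous)
    # Blend the per-bone positions with the skinning weights, shape (V, 4).
    blended = np.einsum("vb,bvi->vi", weights, per_bone)
    return blended[:, :3]


def translation(offset):
    """Build a 4x4 homogeneous translation matrix (illustrative helper)."""
    matrix = np.eye(4)
    matrix[:3, 3] = offset
    return matrix


# Toy example: bone 0 stays fixed, bone 1 moves 2 units along y.
rest = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [2.0, 0.0, 0.0]])
skin_weights = np.array([[1.0, 0.0],   # fully attached to bone 0
                         [0.5, 0.5],   # shared equally by both bones
                         [0.0, 1.0]])  # fully attached to bone 1
transforms = np.stack([np.eye(4), translation([0.0, 2.0, 0.0])])

deformed = linear_blend_skinning(rest, skin_weights, transforms)
print(deformed)
```

In this toy case the first vertex stays in place, the last one follows bone 1 completely, and the middle vertex moves halfway, which is the smooth interpolation between bones that LBS provides near the joints.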




Reconstructed Sequences

Some example sequences:
Handstand
Simple Break-dancing
Somersault
Soccer Juggling
Kick-Boxing [MOV]
Press-Up [MOV]
Pirouettes [MOV]
More...




Publications

Marker-less motion capture of skinned models in a four camera set-up using optical flow and silhouettes

L. Ballan and G. M. Cortelazzo [PDF] [web] [video] [bibtex]
(Best Paper Award)
3DPVT 2008, Atlanta, GA, USA

Videos

Markerless Motion Capture of Skinned Models

L. Ballan [video]
CVPR 2009