3D motion capture by computer vision and virtual rendering

Networked 3D virtual environments allow multiple users to interact with each other over the Internet. Users can share some sense of telepresence by remotely animating an avatar that represents them. However, avatar control may be tedious and still render user gestures poorly. This work aims at animating a user's avatar from real-time 3D motion capture by monoscopic computer vision, thus allowing virtual telepresence to anyone using a personal computer with a webcam. The approach followed consists of registering a 3D articulated upper-body model to a video sequence. This involves searching iteratively for the best match between features extracted from the 3D model and from the image. A two-step registration process matches regions and then edges. The first contribution of this thesis is a method of allocating computing iterations under real-time constraints that achieves optimal robustness and accuracy. The major issue for robust 3D tracking from monocular images is the 3D/2D ambiguity that results from the lack of depth information. Particle filtering has become a popular framework for propagating multiple hypotheses between frames. As a second contribution, this thesis enhances particle filtering for 3D/2D registration under limited computation constraints with a number of heuristics, the contribution of which is demonstrated experimentally. A parameterization of the arm pose based on the end-effector is proposed to better model uncertainty in the depth direction. Finally, evaluation is accelerated by computation on the GPU. In conclusion, the proposed algorithm is demonstrated to provide robust real-time 3D body tracking from a single webcam for a large variety of gestures, including partial occlusions and motion in the depth direction.
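
To illustrate how particle filtering propagates multiple hypotheses between frames, here is a minimal sketch of a generic bootstrap particle filter on a one-dimensional state. It is not the enhanced filter of the thesis: the heuristics, the articulated body model, and the image-based matching are not reproduced. The function names, the random-walk motion model, the Gaussian likelihood, and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std, obs_std, rng):
    """One predict/weight/resample cycle of a bootstrap particle filter."""
    # Predict: diffuse each hypothesis with random-walk noise (assumed motion model).
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each hypothesis
    # (a stand-in for an image-based matching score).
    weights = [
        math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted
    ]
    if sum(weights) == 0.0:
        # Every hypothesis is far from the observation: fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw hypotheses in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def track(observations, n_particles=200, motion_std=0.5, obs_std=1.0, seed=0):
    """Return the mean of the particle set after each observation."""
    rng = random.Random(seed)
    # Start from hypotheses spread over a wide (assumed) range of the state.
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles = particle_filter_step(particles, z, motion_std, obs_std, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Calling `track([5.0] * 20)` returns one estimate per observation. In a 3D/2D registration setting, each particle would instead be a full pose vector and the likelihood would come from comparing model features with image features, so the number of particles trades accuracy against the per-frame computation budget.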

File Type: pdf
File Size: 11 MB
Publication Year: 2011
Author: Gomez Jauregui, David Antonio
Supervisors: Patrick Horain
Institution: Telecom SudParis
Keywords: 3D motion capture, monocular vision, 3D/2D registration, particle filtering, real-time computer vision