Computational models of expressive gesture in multimedia systems
This thesis focuses on the development of paradigms and techniques for the design and implementation of multimodal interactive systems, mainly for performing arts applications. The work addresses research issues in the fields of human-computer interaction, multimedia systems, and sound and music computing.

The thesis is divided into two parts. In the first part, after a short review of the state of the art, the focus moves to the definition of environments in which novel forms of technology-integrated artistic performance can take place. These are distributed, active, mixed-reality environments in which information at different layers of abstraction is conveyed mainly non-verbally through expressive gestures. Expressive gesture is therefore defined, and the internal structure of a virtual observer able to process it (and inhabiting the proposed environments) is described from a multimodal perspective. The definition of the structure of the environments, of the virtual and mixed subjects inhabiting them, and of the techniques for expressive gesture processing provides a source of requirements, a paradigm for design and development, and the basic building blocks for implementing the interactive systems this work addresses.

The second part of the thesis introduces an implementation of a virtual observer, i.e., a virtual subject that observes human full-body movement, extracts expressive features from it, and attempts to classify expressive gestures according to their emotional content. This part introduces techniques for the real-time extraction and classification of expressive features from video, audio, and sensor signals, and applies them to a concrete example: an experiment on dance performance.

The work in this thesis was carried out at the DIST-InfoMus Lab, University of Genova, Italy, in the framework of the EU-IST Project MEGA (Multisensory Expressive Gesture Applications, FP5, IST-1999-20410, 2000-2003).
