Performative Statistical Parametric Speech Synthesis Applied To Interactive Designs

This dissertation introduces interactive designs in the context of statistical parametric synthesis. The objective is to develop methods and designs that enrich human-computer interaction by enabling computers (or other devices) to have more expressive and adjustable voices. First, we tackle the problem of interactive control and present a novel method for performative HMM-based synthesis (pHTS). Second, we apply interpolation methods, initially developed for the traditional HMM-based speech synthesis system, in the interactive framework of pHTS. Third, we integrate articulatory control into our interactive approach. Fourth, we present a collection of interactive applications based on our work. Finally, we unify our research into an open-source library, Mage. To our knowledge, Mage is the first system for interactive programming of HMM-based synthesis that allows real-time manipulation of all speech production levels. It has also been used in cases that are not related to speech, such as audio-visual laughter and stylistic gait synthesis and reconstruction. We realise that performative HMM-based synthesis can find applications in several domains, such as entertainment and gaming, performing arts, assistive applications, culture, education, linguistics, speech pedagogy and therapy. Artificial speech, laughter and motion still lack naturalness. Nevertheless, an important contribution of this dissertation is the engagement of the user in the synthesis and generation process. This lays a first basis for the application of crowdsourcing and gamification approaches in the HMM-based synthesis domain, which will help us not only to tackle several existing problems and improve the existing technology, but also to form new questions to pursue.

File Type: pdf
File Size: 3 MB
Publication Year: 2014
Author: Astrinaki, Maria
Supervisors: Thierry Dutoit, Nicolas d'Alessandro
Institution: University of Mons
Keywords: speech synthesis, statistical parametric synthesis, hidden Markov models, HMM-based speech synthesis, HMMs, HTS, Mage