Glottal Source Estimation and Automatic Detection of Dysphonic Speakers

Among all the biomedical signals, speech is among the most complex ones since it is produced and received by humans. The extraction and the analysis of the information conveyed by this signal are the basis of many applications, including the topics discussed in this thesis: the estimation of the glottal source and the automatic detection of voice pathologies. In the first part of the thesis, after a presentation of existing methods for the estimation of the glottal source, a focus is made on the occurrence of irregular glottal source estimations when the representation based on the Zeros of the Z-Transform (ZZT) is concerned. As this method is sensitive to the location of the analysis window, it is proposed to regularize the estimation by shifting the analysis window around its initial location. The best shift is found by using a dynamic ...

Dubuisson, Thomas — University of Mons


Oscillator-plus-Noise Modeling of Speech Signals

In this thesis we examine the autonomous oscillator model for synthesis of speech signals. The contributions comprise an analysis of realizations and training methods for the nonlinear function used in the oscillator model, the combination of the oscillator model with inverse filtering, both significantly increasing the number of 'successfully' re-synthesized speech signals, and the introduction of a new technique suitable for the re-generation of the noise-like signal component in speech signals. Nonlinear function models are compared in a one-dimensional modeling task regarding their presupposition for adequate re-synthesis of speech signals, in particular considering stability. The considerations also comprise the structure of the nonlinear functions, with the aspect of the possible interpolation between models for different speech sounds. Both regarding stability of the oscillator and the premiss of a nonlinear function structure that may be pre-defined, RBF networks are found a ...

Rank, Erhard — Vienna University of Technology


Advances in Glottal Analysis and its Applications

From artificial voices in GPS to automatic systems of dictation, from voice-based identity verification to voice pathology detection, speech processing applications are nowadays omnipresent in our daily life. By offering solutions to companies seeking efficiency enhancement with simultaneous cost saving, the market of speech technology is forecast to be especially promising in the next years. The present thesis deals with advances in glottal analysis in order to incorporate new techniques within speech processing applications. While current systems are usually based on information related to the vocal tract configuration, the airflow passing through the vocal folds, called the glottal flow, is expected to exhibit a relevant complementarity. Unfortunately, glottal analysis from speech recordings requires specific complex processing operations, which explains why it has been generally avoided. The main goal of this thesis is to provide new advances in glottal analysis ...

Drugman, Thomas — Universite de Mons


EEG-Biofeedback and Epilepsy: Concept, Methodology and Tools for (Neuro)therapy Planning and Objective Evaluation

Objective diagnosis and therapy evaluation are still challenging tasks for many neurological disorders. This is highly related to the diversity of cases and the variety of treatment modalities available. Especially in the case of epilepsy, which is a complex disorder not well-explained at the biochemical and physiological levels, there is the need for investigations for novel features, which can be extracted and quantified from electrophysiological signals in clinical practice. Neurotherapy is a complementary treatment applied in various disorders of the central nervous system, including epilepsy. The method is subsumed under behavioral medicine and is considered an operant conditioning in psychological terms. Although the application areas of this promising unconventional approach are rapidly increasing, the method is strongly debated, since the neurophysiological underpinnings of the process are not yet well understood. Therefore, verification of the efficacy of the treatment is one ...

Kirlangic, Mehmet Eylem — Technische Universitaet Ilmenau


Glottal-Synchronous Speech Processing

Glottal-synchronous speech processing is a field of speech science where the pseudoperiodicity of voiced speech is exploited. Traditionally, speech processing involves segmenting and processing short speech frames of predefined length; this may fail to exploit the inherent periodic structure of voiced speech which glottal-synchronous speech frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs). The SIGMA algorithm was developed for the detection of GCIs and GOIs from the Electroglottograph signal with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms. The GCIs are applied to real-world applications including speech dereverberation, where SNR is improved by up ...

Thomas, Mark — Imperial College London


Realtime and Accurate Musical Control of Expression in Voice Synthesis

In the early days of speech synthesis research, understanding voice production attracted the attention of scientists with the goal of producing intelligible speech. Later, the need to produce more natural voices led researchers to use prerecorded voice databases, containing speech units, reassembled by a concatenation algorithm. With the outgrowth of computer capacities, the length of units increased, going from diphones to non-uniform units, in the so-called unit selection framework, using a strategy referred to as 'take the best, modify the least'. Today the new challenge in voice synthesis is the production of expressive speech or singing. The mainstream solution to this problem is based on the “there is no data like more data” paradigm: emotion-specific databases are recorded and emotion-specific units are segmented. In this thesis, we propose to restart the expressive speech synthesis problem, from its original voice ...

D'Alessandro, N. — Universite de Mons


Automated quantification of preterm brain maturation using electroencephalography

Around 10 percent of all human births are premature, which means that annually about 15 million babies are born before 37 completed weeks of gestation. About one third of the admissions to the Neonatal Intensive Care Unit (NICU) consist of this patient group. Due to complications, 1 million babies die from premature delivery, and it is therefore the most important cause of neonatal death. In general, premature and immature babies have a high risk for neurological abnormalities by maturation in extra-uterine life. Even though improved health care has increased the survival chances of these neonates, they are sensitive to brain damage and consequently, neurocognitive disabilities. Nowadays, critical information about the brain development can be extracted from the electroencephalography (EEG). Clinical experts visually assess evolving EEG characteristics over both short and long periods to evaluate maturation of patients at risk and, ...

Koolen, Ninah — KU Leuven


Speech Enhancement for Disordered and Substitution Voices

This thesis presents methods to enhance the speech of patients with voice disorders or with substitution voices. The first method enhances speech of patients with laryngeal neoplasm. The enhancement enables a reduction of pitch and a strengthening of the harmonics of voiced segments as well as decreasing the perceived speaking effort. The need for reliable pitch mark determination on disordered and substitution voices led to the implementation of a state-space based algorithm. Its performance is comparable to a state-of-the-art pitch detection algorithm but does not require post processing. A subsequent part of the thesis deals with alaryngeal speech, with a focus on Electro-Larynx (EL) speech. After investigating an EL speech production model, which takes into account the common source of the speech signal and the directly radiated EL (DREL) sound, a solution to suppress the direct sound is based ...

Hagmuller, Martin — Graz University of Technology


Automated detection of epileptic seizures in pediatric patients based on accelerometry and surface electromyography

Epilepsy is one of the most common neurological diseases that manifests in repetitive epileptic seizures as a result of an abnormal, synchronous activity of a large group of neurons. Depending on the affected brain regions, seizures produce various severe clinical symptoms. There is no cure for epilepsy and sometimes even medication and other therapies, like surgery, vagus nerve stimulation or ketogenic diet, do not control the number of seizures. In that case, long-term (home) monitoring and automatic seizure detection would enable the tracking of the evolution of the disease and improve objective insight in any responses to medical interventions or changes in medical treatment. Especially during the night, supervision is reduced; hence a large number of seizures is missed. In addition, an alarm should be integrated into the automated seizure detection algorithm for severe seizures in order to help the ...

Milošević, Milica — KU Leuven


Analysis, Modelling, and Simulation of an Integrated Multisensor System for Maritime Border Control

In this dissertation a notional multi-sensor system acting in a maritime border control scenario for Homeland Security (HS) is analyzed, modelled, and simulated. The functions performed by the system are the detection, tracking, identification and classification of naval targets that enter a sea region, the evaluation of their threat level and the selection of a suitable reaction to them. The emulated system is composed of two platforms carrying multiple sensors: a land based platform, located on the coast, and an air platform, moving on an elliptic trajectory in front of the coast. The land based platform is equipped with a Vessel Traffic Service (VTS) radar, an infrared camera (IR) and a station belonging to an Automatic Identification System (AIS). The air platform carries an Airborne Early Warning Radar (AEWR) that can operate in a spotlight Synthetic Aperture Radar (SAR) mode, ...

Giompapa, Sofia — Universita di Pisa


Integrating monaural and binaural cues for sound localization and segregation in reverberant environments

The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. Binaural processing, where input signals resemble those that enter the two ears, is of particular interest in the CASA field. The dominant approach to binaural segregation has been to derive spatially selective filters in order to enhance the signal in a direction of interest. As such, the problems of sound localization and sound segregation are closely tied. While spatial filtering has been widely utilized, substantial performance degradation is incurred in reverberant environments and more fundamentally, segregation cannot be performed without sufficient spatial separation between sources. This dissertation ...

Woodruff, John — The Ohio State University


Acoustic sensor network geometry calibration and applications

In the modern world, we are increasingly surrounded by computation devices with communication links and one or more microphones. Such devices are, for example, smartphones, tablets, laptops or hearing aids. These devices can work together as nodes in an acoustic sensor network (ASN). Such networks are a growing platform that opens the possibility for many practical applications. ASN based speech enhancement, source localization, and event detection can be applied for teleconferencing, camera control, automation, or assisted living. For these kinds of applications, the awareness of auditory objects and their spatial positioning are key properties. In order to provide these two kinds of information, novel methods have been developed in this thesis. Information on the type of auditory objects is provided by a novel real-time sound classification method. Information on the position of human speakers is provided by a novel localization ...

Plinge, Axel — TU Dortmund University


Advanced equalization techniques for DMT-based systems

Digital subscriber line (DSL) technology is one of the fastest growing broadband internet access media. Whereas asymmetric DSL (ADSL) already offers data rates of a few megabits per second, next-generation ADSL2+ and VDSL promise even higher bit rates to support so-called triple play (high-quality video, voice and high-speed data). The use of a large bandwidth over the phone line (up to 12 MHz for VDSL) induces impairments, such as severe channel distortion, echo, narrow-band radiofrequency interference (RFI) and crosstalk from other DSL systems. DSL communication makes use of so-called discrete multitone (DMT) modulation, supplemented with advanced digital signal processing algorithms, to tackle these impairments and serve a maximum number of customers. In this thesis, we focus on channel equalization and RFI mitigation algorithms that outperform existing algorithms in terms of bit rate. DMT equalization is typically done by means of ...

Vanbleu, Koen — Katholieke Universiteit Leuven


Video Processing for Remote Respiration Monitoring

Monitoring of vital signs is a key tool in medical diagnostics to assess the onset and the evolution of several diseases. Among fundamental vital parameters, such as the heart rate, blood pressure and body temperature, the Respiratory Rate (RR) plays an important role. For this reason, respiration needs to be carefully monitored in order to detect potential signs or events indicating possible changes of health conditions. Monitoring of the respiration is generally carried out in hospital and clinical environments by the use of expensive devices with several sensors connected to the patient's body. A new research trend, in order to reduce healthcare service costs and make monitoring of vital signs more comfortable, is the development of low-cost systems which may allow remote and contactless monitoring; in such a context, an appealing method is to rely on video processing-based solutions. In ...

Alinovi, Davide — University of Parma


Least squares support vector machines classification applied to brain tumour recognition using magnetic resonance spectroscopy

Magnetic Resonance Spectroscopy (MRS) is a technique which has evolved rapidly over the past 15 years. It has been used specifically in the context of brain tumours and has shown very encouraging correlations between brain tumour type and spectral pattern. In vivo MRS enables the quantification of metabolite concentrations non-invasively, thereby avoiding serious risks of brain damage. While Magnetic Resonance Imaging (MRI) is commonly used for identifying the location and size of brain tumours, MRS complements it with the potential to provide detailed chemical information about metabolites present in the brain tissue and enable an early detection of abnormality. However, the introduction of MRS in clinical medicine has been difficult due to problems associated with the acquisition of in vivo MRS signals from living tissues at low magnetic fields acceptable for patients. The low signal-to-noise ratio makes accurate analysis of ...

Lukas, Lukas — Katholieke Universiteit Leuven
