Informed spatial filters for speech enhancement

In modern devices that provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications, can be significantly degraded. Therefore, effective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...

Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg


Spatio-temporal characterization of the surface electrocardiogram for catheter ablation outcome prediction in persistent atrial fibrillation

Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia encountered in clinical practice, and one of the main causes of ictus and strokes. Despite the advances in the comprehension of its mechanisms, its thorough characterization and the quantification of its effects on the human heart are still an open issue. In particular, the choice of the most appropriate therapy is frequently a hard task. Radiofrequency catheter ablation (CA) is becoming one of the most popular solutions for the treatment of the disease. Yet, very little is known about its impact on heart substrate during AF, thus leading to an inaccurate selection of positive responders to therapy and a low success rate; hence, the need for advanced signal processing tools able to quantify AF impact on heart substrate and assess the effectiveness of the CA therapy in an objective and ...

Meo, Marianna — Université Nice Sophia Antipolis


Spatio-Temporal Speech Enhancement in Adverse Acoustic Conditions

Never before has speech been captured as often by electronic devices equipped with one or multiple microphones, serving a variety of applications. It is the key aspect in digital telephony, hearing devices, and voice-driven human-to-machine interaction. When speech is recorded, the microphones also capture a variety of further, undesired sound components due to adverse acoustic conditions. Interfering speech, background noise and reverberation, i.e. the persistence of sound in a room after excitation caused by a multitude of reflections on the room enclosure, are detrimental to the quality and intelligibility of target speech as well as the performance of automatic speech recognition. Hence, speech enhancement aiming at estimating the early target-speech component, which contains the direct component and early reflections, is crucial to nearly all speech-related applications presently available. In this thesis, we compare, propose and evaluate existing and novel approaches ...

Dietzen, Thomas — KU Leuven


Super-Resolution Image Reconstruction Using Non-Linear Filtering Techniques

Super-resolution (SR) is a filtering technique that combines a sequence of under-sampled and degraded low-resolution images to produce an image at a higher resolution. The reconstruction takes advantage of the additional spatio-temporal data available in the sequence of images portraying the same scene. The fundamental problem addressed in super-resolution is a typical example of an inverse problem, wherein multiple low-resolution (LR) images are used to solve for the original high-resolution (HR) image. Super-resolution has already proved useful in many practical cases where multiple frames of the same scene can be obtained, including medical applications, satellite imaging and astronomical observatories. The application of super-resolution filtering in consumer cameras and mobile devices should become possible in the future, especially as the computational and memory resources in these devices are increasing all the time. For that goal, several research problems need to be ...

Trimeche, Mejdi — Tampere University of Technology


Sparsity in Linear Predictive Coding of Speech

This thesis deals with developing improved modeling methods for speech and audio processing based on the recent developments in sparse signal representation. In particular, this work is motivated by the need to address some of the limitations of the well-known linear prediction (LP) based all-pole models currently applied in many modern speech and audio processing systems. In the first part of this thesis, we introduce Sparse Linear Prediction, a set of speech processing tools created by introducing sparsity constraints into the LP framework. This approach defines predictors that look for a sparse residual rather than a minimum variance one, with direct applications to coding but also consistent with the speech production model of voiced speech, where the excitation of the all-pole filter is modeled as an impulse train. Introducing sparsity in the LP framework will also lead to developing the ...

Giacobello, Daniele — Aalborg University


Microphone arrays for imaging of aerospace noise sources

With the continuous growth in demand for air traffic and wind turbines, the noise emissions they generate are becoming an increasingly important issue. To reduce their noise levels, it is essential to obtain accurate information about all the sound sources present. Phased microphone arrays and acoustic imaging methods allow for the estimation of the location and strength of sound sources. Experiments with these devices are one of the main approaches in the current research in aeroacoustics, along with computational simulations or noise prediction models. This thesis presents a detailed literature review on the most common aerospace noise sources, challenges in aeroacoustic measurements, and the acoustic imaging methods typically used to overcome them. Practical recommendations are provided for selecting the appropriate imaging technique depending on the type of experiment. New integration techniques for distributed sound sources, such as leading- or trailing-edge ...

Merino-Martinez, Roberto — Delft University of Technology


Distributed Processing Techniques for Parameter Estimation and Efficient Data Gathering in Wireless Communication and Sensor Networks

This dissertation deals with distributed processing techniques for parameter estimation and efficient data gathering in wireless communication and sensor networks. The estimation problem consists in inferring a set of parameters from temporal and spatial noisy observations collected by different nodes that monitor an area or field. The objective is to derive an estimate that is as accurate as the one that would be obtained if each node had access to the information across the entire network. With the aim of enabling an energy-aware and low-complexity distributed implementation of the estimation task, several useful optimization techniques that generally yield linear estimators have been derived in the literature. Up to now, most works have assumed that the nodes are interested in estimating the same vector of global parameters. This scenario can be viewed as a special case of a more general ...

Bogdanovic, Nikola — University of Patras


A flexible scalable video coding framework with adaptive spatio-temporal decompositions

The work presented in this thesis covers topics that extend the scalability functionalities in video coding and improve the compression performance. Two main novel approaches are presented, each targeting a different part of the scalable video coding (SVC) architecture: motion adaptive wavelet transform based on the wavelet transform in lifting implementation, and a design of a flexible framework for generalised spatio-temporal decomposition. Motion adaptive wavelet transform is based on the newly introduced concept of connectivity-map. The connectivity-map describes the underlying irregular structure of regularly sampled data. To enable a scalable representation of the connectivity-map, the corresponding analysis and synthesis operations have been derived. These are then employed to define a joint wavelet connectivity-map decomposition that serves as an adaptive alternative to the conventional wavelet decomposition. To demonstrate its applicability, the presented decomposition scheme is used in the proposed SVC framework, ...

Sprljan, Nikola — Queen Mary University of London


Video person recognition strategies using head motion and facial appearance

In this doctoral dissertation, we principally explore the use of the temporal information available in video sequences for person and gender recognition; in particular, we focus on the analysis of head and facial motion, and their potential application as biometric identifiers. We also investigate how to exploit as much video information as possible for the automatic recognition; more precisely, we examine the possibility of integrating the head and mouth motion information with facial appearance into a multimodal biometric system, and we study the extraction of novel spatio-temporal facial features for recognition. We initially present a person recognition system that exploits the unconstrained head motion information, extracted by tracking a few facial landmarks in the image plane. In particular, we detail how each video sequence is firstly pre-processed by semiautomatically detecting the face, and then automatically tracking the facial landmarks over ...

Matta, Federico — Eurécom / Multimedia communications


Feedback-Channel and Adaptive MIMO Coded-Modulations

When the transmitter of a communication system has some Channel State Information (CSI) available, it is possible to design linear precoders that allocate the power optimally, yielding high gains either in terms of capacity or in terms of reliable communications. In practical scenarios, this channel knowledge is not perfect and thus the transmitted signal suffers from the mismatch between the CSI at the transmitter and the real channel. In that context, this thesis deals with two different, but related, topics: the design of a feasible transmitter channel tracker for time-varying channels, and the design of optimal linear precoders robust to imperfect channel estimates. The first part of the thesis proposes the design of a channel tracker that provides an accurate CSI at the transmitter by means of a low-capacity feedback link. Historically, those schemes have been criticized because ...

Rey, Francesc — Universitat Politecnica de Catalunya


Polynomial Predictive Filters: Implementation and Applications

In this thesis, smoothness of sampled real-world signals is exploited through the application of polynomial predictive filters. The reason for employing the polynomial signal model is twofold: firstly, assuming that the sampling rate is adequate, all real-world signals exhibit piecewise polynomial-like behavior, and secondly, polynomial-based signal processing is computationally efficient. By definition, polynomial predictive filters provide estimates of future values of polynomial-like signals. Thus, the potential applications of this research include a vast number of different delay-sensitive operations on measurements like temperature, position, velocity, or power, especially in the control engineering field. Polynomial-based predictive signal processing is a well-known technique, but polynomial predictive filters have had severe drawbacks, which have hindered their application: their white noise attenuation is generally low, or they exhibit considerable passband gain peaks, rendering them unattractive for most applications. It has been possible to ...

Tanskanen, Jarno M. A. — Helsinki University of Technology


Solving inverse problems in room acoustics using physical models, sparse regularization and numerical optimization

Reverberation is a complex acoustic phenomenon that occurs inside rooms. Many audio signal processing methods, addressing source localization, signal enhancement and other tasks, assume the absence of reverberation. Consequently, reverberant environments are considered challenging, as state-of-the-art methods can perform poorly. The acoustics of a room can be described using a variety of mathematical models, among which physical models are the most complete and accurate. The use of physical models in audio signal processing methods is often non-trivial since it can lead to ill-posed inverse problems. These inverse problems require proper regularization to achieve meaningful results and involve the solution of computationally intensive large-scale optimization problems. Recently, however, sparse regularization has been applied successfully to inverse problems arising in different scientific areas. The increased computational power of modern computers and the development of new efficient optimization algorithms make it possible ...

Antonello, Niccolò — KU Leuven


Speech dereverberation in noisy environments using time-frequency domain signal models

Reverberation is the sum of reflected sound waves and is present in any conventional room. Speech communication devices such as mobile phones in hands-free mode, tablets, smart TVs, teleconferencing systems, hearing aids, voice-controlled systems, etc. use one or more microphones to pick up the desired speech signals. When the microphones are not in the proximity of the desired source, strong reverberation and noise can degrade the signal quality at the microphones and can impair the intelligibility and the performance of automatic speech recognizers. Therefore, there is a strong demand for processing the microphone signals such that reverberation and noise are reduced. The process of reducing or removing reverberation from recorded signals is called dereverberation. As dereverberation is usually a completely blind problem, where the only available information is the microphone signals, and as the acoustic scenario can be non-stationary, ...

Braun, Sebastian — Friedrich-Alexander Universität Erlangen-Nürnberg


Digital Processing Based Solutions for Life Science Engineering Recognition Problems

The field of Life Science Engineering (LSE) is rapidly expanding and predicted to grow strongly in the next decades. It covers areas of food and medical research, plant and pest research, and environmental research. In each research area, engineers try to find equations that model a certain life science problem. Once found, they research different numerical techniques to solve for the unknown variables of these equations. Afterwards, solution improvement is examined by adopting more accurate conventional techniques, or developing novel algorithms. In particular, signal and image processing techniques are widely used to solve those LSE problems that require pattern recognition. However, due to the continuous evolution of the life science problems and their natures, these solution techniques cannot cover all aspects, and therefore demand further enhancement and improvement. The thesis presents numerical algorithms of digital signal and image processing to ...

Hussein, Walid — Technische Universität München


Bayesian State-Space Modelling of Spatio-Temporal Non-Gaussian Radar Returns

Radar backscatter from an ocean surface is commonly referred to as sea clutter. Any radar backscatter not due to the scattering from an ocean surface constitutes a potential target. This thesis is concerned with the study of target detection techniques in the presence of high resolution sea clutter. In this dissertation, the high resolution sea clutter is treated as a compound process, where a fast oscillating speckle component is modulated in power by a slowly varying modulating component. While the short term temporal correlations of the clutter are associated with the speckle, the spatial correlations are largely associated with the modulating component. Due to the disparate statistical and correlation properties of the two components, a piecemeal approach is adopted throughout this thesis, whereby the spatial and the temporal correlations of high resolution sea clutter are treated independently. As an extension ...

Noga, Jacek Leszek — University of Cambridge
