## Model Based Multiple Audio Sequence Alignment (2015)

Particle Filters and Markov Chains for Learning of Dynamical Systems

Sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC) methods provide computational tools for systematic inference and learning in complex dynamical systems, such as nonlinear and non-Gaussian state-space models. This thesis builds upon several methodological advances within these classes of Monte Carlo methods. Particular emphasis is placed on the combination of SMC and MCMC in so-called particle MCMC algorithms. These algorithms rely on SMC for generating samples from the often highly autocorrelated state-trajectory. A specific particle MCMC algorithm, referred to as particle Gibbs with ancestor sampling (PGAS), is suggested. By making use of backward sampling ideas, albeit implemented in a forward-only fashion, PGAS enjoys good mixing even when using seemingly few particles in the underlying SMC sampler. This results in a computationally competitive particle MCMC algorithm. As illustrated in this thesis, PGAS is a useful tool for both ...

Lindsten, Fredrik — Linköping University

This dissertation is concerned with the development of Markov chain Monte Carlo (MCMC) methods for the Bayesian restoration of degraded audio signals. First, the Bayesian approach to time series modelling is reviewed, then established MCMC methods are introduced. The first problem to be addressed is that of model order uncertainty. A reversible-jump sampler is proposed which can move between models of different order. It is shown that faster convergence can be achieved by exploiting the analytic structure of the time series model. This approach to model order uncertainty is applied to the problem of noise reduction using the simulation smoother. The effects of incorrect autoregressive (AR) model orders are demonstrated, and a mixed model order MCMC noise reduction scheme is developed. Nonlinear time series models are surveyed, and the advantages of linear-in-the-parameters models are explained. A nonlinear AR (NAR) model, ...

Troughton, Paul Thomas — University of Cambridge

Discrete-time speech processing with application to emotion recognition

The subject of this PhD thesis is the efficient and robust processing and analysis of the audio recordings that are derived from a call center. The thesis comprises two parts. The first part is dedicated to dialogue/non-dialogue detection and to speaker segmentation. The systems that are developed are a prerequisite for detecting (i) the audio segments that actually contain a dialogue between the system and the call center customer and (ii) the change points between the system and the customer. This way the volume of the audio recordings that need to be processed is significantly reduced, while the system is automated. To detect the presence of a dialogue, several systems are developed. This is the first effort found in the international literature in which the audio channel is exclusively exploited. Also, it is the first time that the speaker utterance ...

Kotti, Margarita — Aristotle University of Thessaloniki

Group-Sparse Regression - With Applications in Spectral Analysis and Audio Signal Processing

This doctoral thesis focuses on sparse regression, a statistical modeling tool for selecting valuable predictors in underdetermined linear models. By imposing different constraints on the structure of the variable vector in the regression problem, one obtains estimates which have sparse supports, i.e., where only a few of the elements in the response variable have non-zero values. The thesis collects six papers which, to a varying extent, deal with the applications, implementations, modifications, translations, and other analysis of such problems. Sparse regression is often used to approximate additive models with intricate, non-linear, non-smooth or otherwise problematic functions, by creating an underdetermined model consisting of candidate values for these functions, and linear response variables which select among the candidates. Sparse regression is therefore a widely used tool in applications such as image processing, audio processing, seismological and biomedical modeling, but is ...

Kronvall, Ted — Lund University

The problem of signal separation is a very broad and fundamental one. A powerful paradigm within which signal separation can be achieved is the assumption that the signals/sources are statistically independent of one another. This is known as Independent Component Analysis (ICA). In this thesis, the theoretical aspects and derivation of ICA are examined, from which disparate approaches to signal separation are drawn together in a unifying framework. This is followed by a review of signal separation techniques based on ICA. Second order statistics based output decorrelation methods are employed to try to solve the challenging problem of separating convolutively mixed signals, in the context of mainly audio source separation and the Cocktail Party Problem. Various optimisation techniques are devised to implement second order signal separation of both artificially mixed signals and real mixtures. A study of the advantages and ...

Ahmed, Alijah — University of Cambridge

Forensic Evaluation of the Evidence Using Automatic Speaker Recognition Systems

This thesis is focused on the use of automatic speaker recognition systems for forensic identification, in what is called forensic automatic speaker recognition. More generally, forensic identification aims at individualization, defined as the certainty of distinguishing an object or person from any other in a given population. This objective is followed by the analysis of the forensic evidence, understood as the comparison between two samples of material, such as glass, blood, speech, etc. An automatic speaker recognition system can be used in order to perform such a comparison between some recovered speech material of questioned origin (e.g., an incriminating wire-tapping) and some control speech material coming from a suspect (e.g., recordings acquired in police facilities). However, the evaluation of such evidence is not a trivial issue at all. In fact, the debate about the presentation of forensic evidence in a court ...

Ramos, Daniel — Universidad Autonoma de Madrid

Identification using Convexification and Recursion

System identification studies how to construct mathematical models for dynamical systems from the input and output data, which finds applications in many scenarios, such as predicting future output of the system or building model-based controllers for regulating the output of the system. Among many other methods, convex optimization is becoming an increasingly useful tool for solving system identification problems. The reason is that many identification problems can be formulated as, or transformed into, convex optimization problems. This transformation is commonly referred to as the convexification technique. The first theme of the thesis is to understand the efficacy of the convexification idea by examining two specific examples. We first establish that an l1-norm-based approach can indeed help in exploiting the sparsity information of the underlying parameter vector under certain persistent excitation assumptions. After that, we analyze how the nuclear ...

Dai, Liang — Uppsala University

Video person recognition strategies using head motion and facial appearance

In this doctoral dissertation, we principally explore the use of the temporal information available in video sequences for person and gender recognition; in particular, we focus on the analysis of head and facial motion, and their potential application as biometric identifiers. We also investigate how to exploit as much video information as possible for the automatic recognition; more precisely, we examine the possibility of integrating the head and mouth motion information with facial appearance into a multimodal biometric system, and we study the extraction of novel spatio-temporal facial features for recognition. We initially present a person recognition system that exploits the unconstrained head motion information, extracted by tracking a few facial landmarks in the image plane. In particular, we detail how each video sequence is firstly pre-processed by semiautomatically detecting the face, and then automatically tracking the facial landmarks over ...

Matta, Federico — Eurécom / Multimedia communications

Interest in the intelligent vehicle field has increased in recent years, most probably due to the large number of road accidents. Many accidents could be avoided if a device attached to the vehicle assisted the driver with warnings when dangerous situations are about to appear. In recent years, leading car developers have made significant efforts and supported research work in the intelligent vehicle field, proposing solutions for the existing problems, especially in the vision domain. Road detection and following, pedestrian or vehicle detection, recognition and tracking, and night vision, among others, are examples of applications which have been developed and improved recently. Still, a lot of challenges and unsolved problems remain in the intelligent vehicle domain. Our purpose in this thesis is to design an Obstacle Recognition system for improving the road security by ...

Apatean, Anca Ioana — Institut National des Sciences Appliquées de Rouen

Accelerating Monte Carlo methods for Bayesian inference in dynamical models

Making decisions and predictions from noisy observations are two important and challenging problems in many areas of society. Some examples of applications are recommendation systems for online shopping and streaming services, connecting genes with certain diseases and modelling climate change. In this thesis, we make use of Bayesian statistics to construct probabilistic models given prior information and historical data, which can be used for decision support and predictions. The main obstacle with this approach is that it often results in mathematical problems lacking analytical solutions. To cope with this, we make use of statistical simulation algorithms known as Monte Carlo methods to approximate the intractable solution. These methods enjoy well-understood statistical properties but are often computationally prohibitive to employ. The main contribution of this thesis is the exploration of different strategies for accelerating inference methods based on sequential Monte Carlo ...

Dahlin, Johan — Linköping University

Image Sequence Restoration Using Gibbs Distributions

This thesis addresses a number of issues concerned with the restoration of one type of image sequence, namely archived black and white motion pictures. These are often a valuable historical record but because of the physical nature of the film they can suffer from a variety of degradations which reduce their usefulness. The main visual defects are ‘dirt and sparkle’ due to dust and dirt becoming attached to the film or abrasion removing the emulsion and ‘line scratches’ due to the film running against foreign bodies in the camera or projector. For an image restoration algorithm to be successful it must be based on a mathematical model of the image. A number of models have been proposed and here we explore the use of a general class of model known as Markov Random Fields (MRFs) based on Gibbs distributions by ...

Morris, Robin David — University of Cambridge

Interactive Real-time Musical Systems

This thesis focuses on the development of automatic accompaniment systems. We investigate previous systems and look at a range of approaches that have been attempted for the problem of beat tracking. Most beat trackers are intended for the purposes of music information retrieval where a ‘black box’ approach is tested on a wide variety of music genres. We highlight some of the difficulties facing offline beat trackers and design a new approach for the problem of real-time drum tracking, developing a system, B-Keeper, which makes reasonable assumptions on the nature of the signal and is provided with useful prior knowledge. Having developed the system with offline studio recordings, we look to test the system with human players. Existing offline evaluation methods seem less suitable for a performance system, since we also wish to evaluate the interaction between musician and ...

Robertson, Andrew — Queen Mary, University of London

Statistical Signal Processing for Data Fusion

In this dissertation we focus on statistical signal processing for Data Fusion, with a particular focus on wireless sensor networks. Six topics are studied: (i) Data Fusion for classification under model uncertainty; (ii) Decision Fusion over coherent MIMO channels; (iii) Performance analysis of Maximum Ratio Combining in MIMO decision fusion; (iv) Decision Fusion over non-coherent MIMO channels; (v) Decision Fusion for distributed classification of multiple targets; (vi) Data Fusion for inverse localization problems, with application to wideband passive sonar platform estimation. The first topic of this thesis addresses the problem of lack of knowledge of the prior distribution in classification problems that operate on small data sets that may make the application of Bayes' rule questionable. Uniform or arbitrary priors may provide classification answers that, even in simple examples, may end up contradicting our common sense about the problem. Entropic ...

Ciuonzo, Domenico — Second University of Naples

Improvements in Pose Invariance and Local Description for Gabor-based 2D Face Recognition

Automatic face recognition has attracted a lot of attention not only because of the large number of practical applications where human identification is needed but also due to the technical challenges involved in this problem: large variability in facial appearance, non-linearity of face manifolds and high dimensionality are some of the most critical handicaps. In order to deal with the above-mentioned challenges, there are two possible strategies: the first is to construct a “good” feature space in which the manifolds become simpler (more linear and more convex). This scheme usually comprises two levels of processing: (1) normalize images geometrically and photometrically and (2) extract features that are stable with respect to these variations (such as those based on Gabor filters). The second strategy is to use classification structures that are able to deal with non-linearities and to generalize properly. To ...

Gonzalez-Jimenez, Daniel — University of Vigo

Bayesian Signal Processing Techniques for GNSS Receivers: from multipath mitigation to positioning

This dissertation deals with the design of satellite-based navigation receivers. The term Global Navigation Satellite Systems (GNSS) refers to those navigation systems based on a constellation of satellites, which emit ranging signals useful for positioning. Although the American GPS is probably the most popular, the European contribution (Galileo) will be operative soon. Other global and regional systems exist, all with the same objective: to aid the user's positioning. Initially, the thesis provides the state of the art in GNSS: navigation signal structure and receiver architecture. The design of a GNSS receiver consists of a number of functional blocks. From the antenna to the final position calculation, the design poses challenges in many research areas. Although the Radio Frequency chain of the receiver is discussed in the thesis, the main focus of the dissertation is on the signal processing algorithms applied after signal digitization. These ...

Closas, Pau — Universitat Politecnica de Catalunya
