Sketching for Large-Scale Learning of Mixture Models (2017)
Compressed sensing and dimensionality reduction for unsupervised learning
This work aims to exploit compressed sensing paradigms in order to reduce the cost of statistical learning tasks. We first review the foundations of compressed sensing and describe some statistical analysis tasks that rely on similar ideas. Then we describe a framework to perform parameter estimation on probabilistic mixture models in a case where training data is compressed to a fixed-size representation called a sketch. We formulate the estimation as a generalized inverse problem for which we propose a greedy algorithm. We experiment with this framework and algorithm on an isotropic Gaussian mixture model. This proof of concept suggests the existence of theoretical recovery guarantees for sparse objects beyond the usual vector and matrix cases. We therefore study the generalization of stability results for linear inverse problems to general signal models encompassing the standard cases as well as sparse mixtures of probability distributions. We ...
Bourrier, Anthony — INRIA, Technicolor
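As a rough illustration of the sketching idea described above, the snippet below compresses a toy two-component Gaussian mixture dataset into a fixed-size sketch of averaged random Fourier features. The sizes, frequencies, and names are illustrative assumptions; the actual parameter estimation from the sketch (the greedy algorithm of the thesis) is not shown.

```python
import numpy as np

def compute_sketch(X, Omega):
    """Empirical sketch: average of complex exponentials exp(i * Omega @ x)
    over the data set, giving a fixed-size summary regardless of n."""
    # X: (n, d) data matrix, Omega: (m, d) random frequency matrix
    return np.exp(1j * X @ Omega.T).mean(axis=0)   # shape (m,)

rng = np.random.default_rng(0)
d, n, m = 2, 10000, 50
# toy mixture of two isotropic Gaussians (hypothetical data)
X = np.vstack([rng.normal(-2.0, 1.0, size=(n // 2, d)),
               rng.normal(+2.0, 1.0, size=(n // 2, d))])
Omega = rng.normal(scale=1.0, size=(m, d))          # random frequencies
z = compute_sketch(X, Omega)                        # m numbers summarize n points
print(z.shape)
```

Mixture parameters would then be estimated by searching for a model whose (analytically known) sketch best matches z, which is what makes the memory footprint independent of the number of samples.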
Bayesian Compressed Sensing using Alpha-Stable Distributions
During the last decades, information has been gathered and processed at an explosive rate. This fact gives rise to a very important issue, namely, how to effectively and precisely describe the information content of a given source signal or an ensemble of source signals, such that it can be stored, processed or transmitted while taking into consideration the limitations and capabilities of the various digital devices. One of the fundamental principles of signal processing for decades has been the Nyquist-Shannon sampling theorem, which states that the minimum number of samples needed to reconstruct a signal without error is dictated by its bandwidth. However, there are many cases in our everyday life in which sampling at the Nyquist rate results in too much data, demanding increased processing power as well as storage. A mathematical theory that emerged ...
Tzagkarakis, George — University of Crete
Due to a variety of potential barriers to sample acquisition, many of the datasets encountered in important classification applications, ranging from tumor identification to facial recognition, are characterized by small samples of high-dimensional data. In such situations, linear classifiers are popular as they have less risk of overfitting while being faster and more interpretable than non-linear classifiers. They are also easier to understand and implement for the inexperienced practitioner. In this dissertation, several gaps in the literature regarding the analysis and design of linear classifiers for high-dimensional data are addressed using tools from the field of asymptotic Random Matrix Theory (RMT) which facilitate the derivation of limits of relevant quantities or distributions, such as the probability of misclassification of a particular classifier or the asymptotic distribution of its discriminant, in the RMT regime where both the sample size and dimensionality ...
Niyazi, Lama — King Abdullah University of Science and Technology
Extended target tracking using PHD filters
The world in which we live is becoming more and more automated, exemplified by the numerous robots, or autonomous vehicles, that operate in air, on land, or in water. These robots perform a wide array of different tasks, ranging from the dangerous, such as underground mining, to the boring, such as vacuum cleaning. Common to all these robots is that they must possess a certain degree of awareness, both of themselves and of the world in which they operate. This thesis considers aspects of two research problems associated with this, more specifically the Simultaneous Localization and Mapping (SLAM) problem and the Multiple Target Tracking (MTT) problem. The SLAM problem consists of having the robot create a map of an environment and simultaneously localize itself in the same map. One way to reduce the effect of small errors that inevitably ...
Granström, Karl — Linköping University
Robust Methods for Sensing and Reconstructing Sparse Signals
Compressed sensing (CS) is a recently introduced signal acquisition framework that goes against the traditional Nyquist sampling paradigm. CS demonstrates that a sparse, or compressible, signal can be acquired using a low-rate acquisition process. Since noise is always present in practical data acquisition systems, sensing and reconstruction methods are commonly developed assuming a Gaussian (light-tailed) model for the corrupting noise. However, when the underlying signal and/or the measurements are corrupted by impulsive noise, commonly employed linear sampling operators, coupled with Gaussian-derived reconstruction algorithms, fail to recover a close approximation of the signal. This dissertation develops robust sampling and reconstruction methods for sparse signals in the presence of impulsive noise. To achieve this objective, we make use of robust statistics theory to develop appropriate methods addressing the problem of impulsive noise in CS systems. We develop a generalized Cauchy distribution (GCD) ...
Carrillo, Rafael — University of Delaware
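A minimal sketch of robust sparse recovery under impulsive noise, assuming a Cauchy-type reweighting inside an iterative soft-thresholding loop; this is an illustrative stand-in, not the generalized Cauchy distribution methods developed in the dissertation, and all sizes and parameters are made up.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def robust_ista(A, y, lam=0.05, gamma=1.0, n_iter=500):
    """Sparse recovery under impulsive noise: iterative soft thresholding
    where each residual is down-weighted by a Cauchy-type weight
    1 / (gamma^2 + r^2), so gross (impulsive) residuals barely influence
    the gradient step."""
    x = np.zeros(A.shape[1])
    eta = gamma**2 / np.linalg.norm(A, 2)**2          # safe step size
    for _ in range(n_iter):
        r = y - A @ x
        w = 1.0 / (gamma**2 + r**2)                   # Cauchy-derived weights
        x = soft_threshold(x + eta * (A.T @ (w * r)), lam * eta)
    return x

rng = np.random.default_rng(1)
n, m, k = 200, 80, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
noise = 0.01 * rng.normal(size=m)
noise[rng.choice(m, 4, replace=False)] += 10.0        # a few gross outliers
y = A @ x_true + noise
x_hat = robust_ista(A, y)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

An ordinary Gaussian-derived solver would treat the outlier measurements at face value; the reweighting effectively caps their influence, which is the core idea behind robust CS reconstruction.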
Compressed Sensing: Novel Applications, Challenges, and Techniques
Compressed Sensing (CS) is a widely used technique for efficient signal acquisition, in which a very small number of (possibly noisy) linear measurements of an unknown signal vector are taken via multiplication with a designed ‘sensing matrix’ in an application-specific manner; the signal is later recovered by exploiting its sparsity in some known orthonormal basis, together with special properties of the sensing matrix that allow such recovery. We study three new applications of CS, each of which poses a unique challenge in a different aspect of it, and propose novel techniques to solve them, advancing the field of CS. Each application involves a unique combination of realistic assumptions on the measurement noise model and the signal, and a unique set of algorithmic challenges. We frame Pooled RT-PCR Testing for COVID-19 – wherein RT-PCR (Reverse Transcription Polymerase Chain ...
Ghosh, Sabyasachi — Department of Computer Science and Engineering, Indian Institute of Technology Bombay
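To make the pooled-testing formulation concrete, here is a toy CS-style measurement model, assuming a random 0/1 pooling matrix, nonnegative sparse viral loads, and a simple nonnegative iterative soft-thresholding solver; the pool sizes, noise model, and parameter values are hypothetical and not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_pools, n_infected = 100, 20, 2
# Each pool mixes a random subset of samples: a 0/1 'sensing matrix'.
A = (rng.random((n_pools, n_samples)) < 0.3).astype(float)
x_true = np.zeros(n_samples)                               # per-sample viral loads
x_true[rng.choice(n_samples, n_infected, replace=False)] = rng.uniform(1.0, 5.0, n_infected)
y = A @ x_true * np.exp(0.05 * rng.normal(size=n_pools))   # noisy pool measurements

# Nonnegative ISTA: sparsity and nonnegativity mirror the facts that only a
# few people are infected and that viral loads cannot be negative.
x = np.zeros(n_samples)
step = 1.0 / np.linalg.norm(A, 2) ** 2
lam = 0.1
for _ in range(2000):
    x = np.maximum(x + step * (A.T @ (y - A @ x)) - lam * step, 0.0)
print(np.flatnonzero(x > 0.1), np.flatnonzero(x_true))     # estimated vs true infected
```

The point of pooling is that far fewer tests (pools) than individuals are needed as long as the number of infected samples stays small, which is exactly the sparse-recovery regime of CS.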
Video Content Analysis by Active Learning
Advances in compression techniques, decreasing cost of storage, and high-speed transmission have facilitated the way videos are created, stored and distributed. As a consequence, videos are now being used in many application areas. The increase in the amount of video data deployed and used in today's applications not only reveals the importance of video as a multimedia data type, but also leads to the requirement of efficient management of video data. This need has paved the way for new research areas, such as indexing and retrieval of videos with respect to their spatio-temporal, visual and semantic contents. This thesis presents work towards a unified framework for semi-automated video indexing and interactive retrieval. To create an efficient index, a set of representative key frames is selected to capture and encapsulate the entire video content. This is achieved by, firstly, segmenting the video into its constituent ...
Camara Chavez, Guillermo — Federal University of Minas Gerais
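A very rough sketch of the key-frame selection stage described above, assuming a simple histogram-difference shot-boundary heuristic; the thesis's actual segmentation and key-frame selection methods are more elaborate, and the toy "video" below is synthetic.

```python
import numpy as np

def keyframe_indices(frames, n_bins=32, threshold=0.4):
    """Mark a frame as a key frame when its grey-level histogram differs
    enough (total-variation distance) from the previous key frame."""
    keys = [0]
    ref = np.histogram(frames[0], bins=n_bins, range=(0, 255))[0].astype(float)
    ref /= ref.sum()
    for i, f in enumerate(frames[1:], start=1):
        h = np.histogram(f, bins=n_bins, range=(0, 255))[0].astype(float)
        h /= h.sum()
        if 0.5 * np.abs(h - ref).sum() > threshold:
            keys.append(i)
            ref = h
    return keys

# toy "video": 60 random grey frames with an abrupt brightness change at frame 30
rng = np.random.default_rng(9)
frames = np.concatenate([rng.integers(0, 100, (30, 48, 64)),
                         rng.integers(120, 255, (30, 48, 64))]).astype(float)
print(keyframe_indices(frames))    # expect roughly [0, 30]
```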
Compressed sensing approaches to large-scale tensor decompositions
Today’s society is characterized by an abundance of data that is generated at an unprecedented velocity. However, much of this data is immediately thrown away by compression or information extraction. In a compressed sensing (CS) setting, the inherent sparsity in many datasets is exploited by avoiding the acquisition of superfluous data in the first place. We combine this technique with tensors, or multiway arrays of numerical values, which are higher-order generalizations of vectors and matrices. As the number of entries scales exponentially in the order, tensor problems are often large-scale. We show that the combination of simple, low-rank tensor decompositions with CS effectively alleviates or even breaks the so-called curse of dimensionality. After discussing the larger data fusion optimization framework for coupled and constrained tensor decompositions, we investigate three categories of CS-type algorithms to deal with large-scale problems. First, ...
Vervliet, Nico — KU Leuven
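As a small illustration of why low-rank tensor decompositions alleviate the curse of dimensionality, the following snippet reconstructs a fifth-order tensor from (assumed, randomly generated) CPD factor matrices and compares the number of stored values; the CS step of fitting the factors from subsampled entries is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
I, N, R = 20, 5, 3                       # 20**5 = 3.2M entries vs 5*20*3 = 300 numbers

def cpd_reconstruct(factors):
    """Rebuild the full tensor from CPD factor matrices:
    T = sum_r a1_r o a2_r o ... o aN_r (outer products over the rank index)."""
    T = 0.0
    for r in range(factors[0].shape[1]):
        outer = factors[0][:, r]
        for U in factors[1:]:
            outer = np.multiply.outer(outer, U[:, r])
        T = T + outer
    return T

factors = [rng.normal(size=(I, R)) for _ in range(N)]
T = cpd_reconstruct(factors)
print(T.shape, T.size, sum(U.size for U in factors))   # full vs compressed storage
```

Storing or sampling only what is needed to identify the 300 factor entries, rather than all 3.2 million tensor entries, is the kind of saving the combination of CS and low-rank decompositions targets.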
Variational Sparse Bayesian Learning: Centralized and Distributed Processing
In this thesis we investigate centralized and distributed variants of sparse Bayesian learning (SBL), an effective probabilistic regression method used in machine learning. Since inference in an SBL model is not tractable in closed form, approximations are needed. We focus on the variational Bayesian approximation, as opposed to others used in the literature, for three reasons: First, it is a flexible general framework for approximate Bayesian inference that estimates probability densities, including point estimates as a special case. Second, it has guaranteed convergence properties. And third, it is a deterministic approximation concept that is applicable even for high-dimensional problems where non-deterministic sampling methods may be prohibitive. We resolve some inconsistencies in the literature concerning other SBL approximation techniques with regard to a proper Bayesian treatment and the incorporation of a highly desirable property, namely scale invariance. More specifically, ...
Buchgraber, Thomas — Graz University of Technology
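A minimal sparse Bayesian learning loop, assuming a fixed noise precision and EM-style hyperparameter updates; the variational and distributed variants studied in the thesis share this alternating structure but use different update equations, so treat this purely as a sketch of the SBL idea.

```python
import numpy as np

def sbl_em(A, y, beta=100.0, n_iter=100):
    """Alternate between the Gaussian posterior of the weights and the
    per-weight prior precisions alpha. Weights whose alpha grows large are
    effectively pruned, yielding a sparse estimate."""
    n = A.shape[1]
    alpha = np.ones(n)
    for _ in range(n_iter):
        Sigma = np.linalg.inv(beta * A.T @ A + np.diag(alpha))  # posterior covariance
        mu = beta * Sigma @ A.T @ y                              # posterior mean
        alpha = 1.0 / (mu**2 + np.diag(Sigma))                   # precision update
    return mu, alpha

rng = np.random.default_rng(4)
n, m, k = 100, 40, 4
A = rng.normal(size=(m, n)) / np.sqrt(m)
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = 3.0 * rng.normal(size=k)
y = A @ x + 0.1 * rng.normal(size=m)
mu, alpha = sbl_em(A, y, beta=100.0)
print(np.flatnonzero(np.abs(mu) > 0.5), np.flatnonzero(x))   # estimated vs true support
```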
Sparsity-Aware Wireless Networks: Localization and Sensor Selection
Wireless networks have revolutionized today's world by providing real-time, cost-efficient service and connectivity. Yet even such an unprecedented level of service could not fulfill the insatiable desire of the modern world for more advanced technologies. As a result, a great deal of attention has been directed towards (mobile) wireless sensor networks (WSNs), which are composed of inexpensive nodes that can cooperate to perform complex tasks in a distributed fashion in extremely harsh environments. Unique features of wireless environments, added complexity owing to mobility, the distributed nature of the network setup, and tight performance and energy constraints pose a challenge for researchers to devise systems which strike a proper balance between performance and resource utilization. We study some of the fundamental challenges of wireless (sensor) networks associated with resource efficiency, scalability, and location-awareness. The pivotal point which distinguishes our studies from ...
Jamali-Rad, Hadi — TU Delft
Explicit and implicit tensor decomposition-based algorithms and applications
Various real-life data such as time series and multi-sensor recordings can be represented by vectors and matrices, which are one-way and two-way arrays of numerical values, respectively. Valuable information can be extracted from these measured data matrices by means of matrix factorizations in a broad range of applications within signal processing, data mining, and machine learning. While matrix-based methods are powerful and well-known tools for various applications, they are limited to single-mode variations, making them ill-suited to tackle multi-way data without loss of information. Higher-order tensors are a natural extension of vectors (first order) and matrices (second order), enabling us to represent multi-way arrays of numerical values, which have become ubiquitous in signal processing and data mining applications. By leveraging the powerful utilities offered by tensor decompositions such as compression and uniqueness properties, we can extract more information from multi-way ...
Boussé, Martijn — KU Leuven
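As a small illustration of working with higher-order tensors, the snippet below computes a truncated higher-order SVD (a Tucker-type decomposition) from mode-n unfoldings; the dimensions and ranks are arbitrary, and this is not one of the thesis's specific algorithms.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: arrange the mode-n fibers of T as rows of a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated higher-order SVD: one factor matrix per mode from the SVD of
    the corresponding unfolding, plus a core obtained by projecting T."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)
    return core, factors

T = np.random.default_rng(5).normal(size=(10, 12, 14))
core, factors = hosvd(T, ranks=(3, 3, 3))
print(core.shape, [U.shape for U in factors])   # small core plus one factor per mode
```

The multilinear structure (a small core interacting with one factor matrix per mode) is what lets tensor methods capture variation along every mode simultaneously, unlike a single matrix factorization.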
Contributions to signal analysis and processing using compressed sensing techniques
Chapter 2 contains a short introduction to the fundamentals of compressed sensing theory, which is the larger context of this thesis. We start by introducing the key concepts of sparsity and sparse representations of signals. We discuss the central problem of compressed sensing, i.e. how to adequately recover sparse signals from a small number of measurements, as well as the multiple formulations of the reconstruction problem. A large part of the chapter is devoted to some of the most important conditions necessary and/or sufficient to guarantee accurate recovery. The aim is to introduce the reader to the basic results, without the burden of detailed proofs. In addition, we also present a few of the popular reconstruction and optimization algorithms that we use throughout the thesis. Chapter 3 presents an alternative sparsity model known as analysis sparsity, which offers similar recovery ...
Cleju, Nicolae — "Gheorghe Asachi" Technical University of Iasi
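One of the popular greedy reconstruction algorithms alluded to above is Orthogonal Matching Pursuit; a compact version is sketched below under simplifying assumptions (Gaussian sensing matrix, noiseless measurements, known sparsity level).

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily pick the column most correlated
    with the residual, then re-fit all selected coefficients by least squares."""
    support, residual = [], y.copy()
    x = np.zeros(A.shape[1])
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x[support] = coeffs
    return x

rng = np.random.default_rng(6)
n, m, k = 256, 64, 6
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = A @ x_true                               # noiseless measurements
print(np.linalg.norm(omp(A, y, k) - x_true)) # reconstruction error
```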
Bayesian data fusion for distributed learning
This dissertation explores the intersection of data fusion, federated learning, and Bayesian methods, with a focus on their applications in indoor localization, GNSS, and image processing. Data fusion involves integrating data and knowledge from multiple sources. It becomes essential when data is only available in a distributed fashion or when different sensors are used to infer a quantity of interest. Data fusion typically includes raw data fusion, feature fusion, and decision fusion. In this thesis, we concentrate on feature fusion. Distributed data fusion involves merging sensor data from different sources to estimate an unknown process. A Bayesian framework is often used because it preserves the full distribution of the unknown given the data, called the posterior, at each agent, providing optimal and explainable estimates. This allows for easy and recursive merging of sensor data ...
Wu, Peng — Northeastern University
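As a toy example of the Bayesian fusion discussed above, the following snippet performs precision-weighted fusion of two independent Gaussian estimates of the same quantity; real distributed fusion must additionally handle correlated or common information between agents, which is not modeled here, and the numbers are made up.

```python
import numpy as np

def fuse_gaussian_estimates(means, covariances):
    """Precision-weighted (Bayesian) fusion of independent Gaussian estimates
    of the same quantity: precisions add, and the fused mean is the
    precision-weighted average of the local means."""
    precisions = [np.linalg.inv(P) for P in covariances]
    fused_cov = np.linalg.inv(sum(precisions))
    fused_mean = fused_cov @ sum(L @ m for L, m in zip(precisions, means))
    return fused_mean, fused_cov

# two agents estimate the same 2-D position with different uncertainties
m1, P1 = np.array([2.0, 1.0]), np.diag([1.0, 4.0])
m2, P2 = np.array([2.5, 0.5]), np.diag([4.0, 1.0])
mean, cov = fuse_gaussian_estimates([m1, m2], [P1, P2])
print(mean, np.diag(cov))   # fused estimate is tighter than either input
```

Because the posterior is carried around rather than a single point estimate, each new sensor or agent can be folded in recursively with the same update.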
A Geometric Deep Learning Approach to Sound Source Localization and Tracking
The localization and tracking of sound sources using microphone arrays is a problem that, even though it has attracted attention from the signal processing research community for decades, remains open. In recent years, deep learning models have surpassed the state of the art established by classic signal processing techniques, but these models still struggle with handling rooms with strong reverberation or tracking multiple sources that dynamically appear and disappear, especially when we cannot apply any criteria to classify or order them. In this thesis, we follow the ideas of the Geometric Deep Learning framework to propose new models and techniques that advance the state of the art in the aforementioned scenarios. As the input of our models, we use acoustic power maps computed using the SRP-PHAT algorithm, a classic signal processing technique that allows us to estimate the acoustic energy ...
Diaz-Guerra, David — University of Zaragoza
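The SRP-PHAT maps used as model inputs are built from generalized cross-correlations between microphone pairs; below is a small GCC-PHAT sketch for two simulated microphone signals (SRP-PHAT would then sum such correlations over candidate source positions). The sampling rate, delay, and noise level are assumed values for illustration only.

```python
import numpy as np

def gcc_phat(x1, x2):
    """GCC-PHAT between two microphone signals: whitening by the magnitude of
    the cross-spectrum keeps only the phase (i.e. delay) information."""
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12
    return np.fft.fftshift(np.fft.irfft(cross, n))

fs, delay = 16000, 12                      # hypothetical inter-mic delay in samples
rng = np.random.default_rng(7)
src = rng.normal(size=4096)
mic1 = src + 0.05 * rng.normal(size=src.size)
mic2 = np.roll(src, delay) + 0.05 * rng.normal(size=src.size)
cc = gcc_phat(mic1, mic2)
# peak location gives the delay (negative of the applied shift with this convention)
print(np.argmax(cc) - len(cc) // 2)
```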
Compressive Sensing of Cyclostationary Propeller Noise
This dissertation is the combination of three manuscripts (either published in or submitted to journals) on compressive sensing of propeller noise for detection, identification and localization of water crafts. Propeller noise, produced by the rotating blades, is broadband and radiates through water, dominating the underwater acoustic noise spectrum, especially when cavitation develops. Propeller cavitation yields cyclostationary noise which can be modeled by amplitude modulation, i.e., an envelope-carrier product. The envelope consists of the so-called propeller tonals, which represent propeller characteristics used to identify water crafts, whereas the carrier is a stationary broadband process. Sampling for propeller noise processing yields large data sizes due to the Nyquist rate and the deployment of multiple sensors. A compressive sensing scheme is proposed for efficient sampling of second-order cyclostationary propeller noise since the spectral correlation function of the amplitude modulation model is sparse as shown in ...
Fırat, Umut — Istanbul Technical University
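To illustrate the amplitude-modulation model of cavitation noise, the snippet below multiplies a periodic envelope (hypothetical shaft-rate and blade-rate tonals) with a white broadband carrier and inspects a DEMON-style envelope spectrum; the sparsity of such spectral lines is what a compressive sampling scheme can exploit. All rates and amplitudes are invented for the example.

```python
import numpy as np

fs, T = 8000, 4.0
t = np.arange(int(fs * T)) / fs
rng = np.random.default_rng(8)

# Amplitude-modulation model of cavitating propeller noise: a periodic
# envelope (shaft/blade-rate tonals) multiplying a stationary broadband carrier.
shaft_rate, blades = 5.0, 4                       # hypothetical values
envelope = (1.0 + 0.5 * np.cos(2 * np.pi * shaft_rate * t)
                + 0.3 * np.cos(2 * np.pi * blades * shaft_rate * t))
carrier = rng.normal(size=t.size)                 # stationary broadband process
x = envelope * carrier

# Simple cyclostationarity indicator: the spectrum of the squared signal
# shows lines at frequencies related to the modulation (propeller tonals).
spec = np.abs(np.fft.rfft(x**2 - np.mean(x**2)))
freqs = np.fft.rfftfreq(x.size, 1 / fs)
peaks = freqs[np.argsort(spec)[-6:]]
print(np.sort(peaks))   # expect lines near multiples of the 5 Hz shaft rate
```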