Similar: Learning Transferable Knowledge through Embedding Spaces

Video Content Analysis by Active Learning

Advances in compression techniques, the decreasing cost of storage, and high-speed transmission have facilitated the way videos are created, stored and distributed. As a consequence, videos are now being used in many application areas. The increase in the amount of video data deployed and used in today's applications not only reveals the importance of video as a multimedia data type, but has also led to the need for efficient management of video data. This need paved the way for new research areas, such as indexing and retrieval of videos with respect to their spatio-temporal, visual and semantic contents. This thesis presents work towards a unified framework for semi-automated video indexing and interactive retrieval. To create an efficient index, a set of representative key frames is selected that captures and encapsulates the entire video content. This is achieved by, firstly, segmenting the video into its constituent ...

Camara Chavez, Guillermo — Federal University of Minas Gerais

On Bayesian Methods for Black-Box Optimization: Efficiency, Adaptation and Reliability

Recent advances in many fields, ranging from engineering to natural science, require increasingly complicated optimization tasks in experiment design, for which the target objectives are generally black-box functions that are expensive to evaluate. In a common formulation of this problem, a designer is expected to solve the black-box optimization tasks by sequentially attempting candidate solutions and receiving feedback from the system. This thesis considers Bayesian optimization (BO) as the black-box optimization framework, and investigates enhancements to BO from the aspects of efficiency, adaptation and reliability. Generally, BO consists of a surrogate model for providing probabilistic inference and an acquisition function which leverages the probabilistic inference for selecting the next candidate solution. The Gaussian process (GP) is a prominent non-parametric surrogate model, and the quality of its inference is a critical factor on the optimality performance ...

Zhang, Yunchuan — King's College London
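The surrogate-plus-acquisition loop described in the abstract above can be sketched in a few lines. This is only an illustrative sketch, not the method of the thesis: the RBF kernel, its length scale, the expected-improvement acquisition, the grid search over [0, 1], and all function names (`rbf_kernel`, `gp_posterior`, `expected_improvement`, `bayes_opt`) are assumptions chosen for brevity.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential kernel between two 1-D point sets (assumed choice).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    # Zero-mean GP regression: posterior mean and standard deviation.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(K, K_s)           # K^{-1} K_s
    mean = solved.T @ y_train
    var = 1.0 - np.sum(K_s * solved, axis=0)   # prior variance is 1
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    # Expected improvement for minimization.
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, n_init=3, n_iter=10, seed=0):
    # Sequentially query the black box where the acquisition is largest.
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)          # candidate solutions
    x = rng.uniform(0.0, 1.0, n_init)
    y = np.array([objective(v) for v in x])
    for _ in range(n_iter):
        mean, std = gp_posterior(x, y, grid)
        ei = expected_improvement(mean, std, y.min())
        x_next = grid[np.argmax(ei)]
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    return x, y

if __name__ == "__main__":
    xs, ys = bayes_opt(lambda v: (v - 0.3) ** 2)
    print("best x:", xs[np.argmin(ys)], "best y:", ys.min())
```

The GP posterior plays the role of the probabilistic inference, and expected improvement is one common way to turn that inference into a choice of the next candidate; other acquisition functions fit the same loop.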

Joint Modeling and Learning Approaches for Hyperspectral Imaging and Changepoint Detection

In the era of artificial intelligence, there has been a growing consensus that solutions to complex science and engineering problems require novel methodologies that can integrate interpretable physics-based modeling approaches with machine learning techniques, from stochastic optimization to deep neural networks. This thesis aims to develop new methodological and applied frameworks for combining the advantages of physics-based modeling and machine learning, with special attention to two important signal processing tasks: solving inverse problems in hyperspectral imaging and detecting changepoints in time series. The first part of the thesis addresses learning priors in model-based optimization for solving inverse problems in hyperspectral imaging systems. First, we introduce a tuning-free Plug-and-Play algorithm for hyperspectral image deconvolution (HID). Specifically, we decompose the optimization problem into two iterative sub-problems, learn deep priors to solve the blind denoising sub-problem with neural networks, and estimate hyperparameters with ...

Xiuheng Wang — Université Côte d'Azur

Bayesian data fusion for distributed learning

This dissertation explores the intersection of data fusion, federated learning, and Bayesian methods, with a focus on their applications in indoor localization, GNSS, and image processing. Data fusion involves integrating data and knowledge from multiple sources. It becomes essential when data is only available in a distributed fashion or when different sensors are used to infer a quantity of interest. Data fusion typically includes raw data fusion, feature fusion, and decision fusion. In this thesis, we will concentrate on feature fusion. Distributed data fusion involves merging sensor data from different sources to estimate an unknown process. A Bayesian framework is often used because it can provide an optimal and explainable feature by preserving the full distribution of the unknown given the data, called the posterior, over the estimated process at each agent. This allows for easy and recursive merging of sensor data ...

Peng Wu — Northeastern University
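The recursive merging of posteriors mentioned in the abstract above can be illustrated with the simplest possible case. This is a sketch under strong assumptions that are not taken from the thesis: each agent holds a scalar Gaussian posterior, the agents' measurements are conditionally independent, and the prior is flat, so that fusing reduces to multiplying Gaussians. The function names are hypothetical.

```python
import numpy as np

def fuse_gaussians(means, variances):
    # Product of independent Gaussian posteriors (flat prior assumed):
    # precisions add, and the mean is the precision-weighted average.
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    fused_var = 1.0 / precisions.sum()
    fused_mean = fused_var * (precisions * means).sum()
    return fused_mean, fused_var

def fuse_recursively(means, variances):
    # Merge one agent at a time; the result matches the batch fusion.
    m, v = means[0], variances[0]
    for m_i, v_i in zip(means[1:], variances[1:]):
        m, v = fuse_gaussians([m, m_i], [v, v_i])
    return m, v

if __name__ == "__main__":
    print(fuse_gaussians([1.0, 3.0], [1.0, 1.0]))
    print(fuse_recursively([0.0, 2.0, 5.0], [1.0, 4.0, 0.5]))
```

Because only a mean and a variance are exchanged, each agent can fold in a neighbor's estimate as it arrives, which is what makes the merge recursive.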

Distributed Stochastic Optimization in Non-Differentiable and Non-Convex Environments

The first part of this dissertation considers distributed learning problems over networked agents. The general objective of distributed adaptation and learning is the solution of global, stochastic optimization problems through localized interactions and without information about the statistical properties of the data. Regularization is a useful technique to encourage or enforce structural properties on the resulting solution, such as sparsity or constraints. A substantial number of regularizers are inherently non-smooth, while many cost functions are differentiable. We propose distributed and adaptive strategies that are able to minimize aggregate sums of objectives. In doing so, we exploit the structure of the individual objectives as sums of differentiable costs and non-differentiable regularizers. The resulting algorithms are adaptive in nature and able to continuously track drifts in the problem; their recursions, however, are subject to persistent perturbations arising from the stochastic nature of ...

Vlaski, Stefan — University of California, Los Angeles
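The structure described in the abstract above, a differentiable cost plus a non-differentiable regularizer, is what proximal gradient methods exploit. The following is a deliberately simplified sketch, not the distributed stochastic algorithm of the dissertation: it assumes a single agent, a deterministic least-squares cost, and an l1 regularizer, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_lasso(A, b, lam, n_iter=500):
    # Minimize 0.5 * ||A w - b||^2 + lam * ||w||_1.
    # Gradient step on the smooth part, proximal step on the l1 part.
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)   # 1 / Lipschitz constant
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - b)
        w = soft_threshold(w - step * grad, step * lam)
    return w

if __name__ == "__main__":
    A = np.eye(2)
    b = np.array([3.0, -0.5])
    print(proximal_gradient_lasso(A, b, lam=1.0))
```

With an identity `A`, the minimizer is the soft-thresholded `b`, so the small entry is driven exactly to zero, which shows how the regularizer enforces sparsity.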

Robust Adaptive Machine Learning Algorithms for Distributed Signal Processing

Distributed networks comprising a large number of nodes, e.g., Wireless Sensor Networks, Personal Computers (PCs), laptops, smart phones, etc., which cooperate with each other in order to reach a common goal, constitute a promising technology for several applications. Typical examples include distributed environmental monitoring, acoustic source localization, power spectrum estimation, etc. Sophisticated cooperation mechanisms can significantly benefit the learning process, through which the nodes achieve their common objective. In this dissertation, the problem of adaptive learning in distributed networks is studied, focusing on the task of distributed estimation. A set of nodes sense information related to certain parameters and the estimation of these parameters constitutes the goal. To this end, nodes exploit locally sensed measurements as well as information arising from interactions with other nodes of the network. Throughout this dissertation, the cooperation among the nodes follows the diffusion optimization ...

Chouvardas, Symeon — National and Kapodistrian University of Athens

Sequential Reasoning with Socially Caused Beliefs

Machine learning and artificial intelligence methods have achieved remarkable success, matching and even surpassing human capabilities in various complex tasks. However, many demonstrations have generally neglected a critical part of the intelligence that is prevalent in the real world, namely, the one that emerges from the collective of interconnected individuals with diverse capabilities, perspectives and experiences. To explore this fact, the current dissertation utilizes mathematical models of collaborative learning and reasoning. These models are based on the following two concepts: Bayesian inference, which is used to model how agents update their beliefs in the face of uncertain data, and graphs, which represent the communication links and information exchange among individuals. Through these models, the current work examines the effect of dynamic models on learning, as well as the implications of causal interactions among agents on their decisions. In particular, this ...

Kayaalp, Mert — EPFL

Disentanglement for improved data-driven modeling of dynamical systems

Modeling dynamical systems is a fundamental task in various scientific and engineering domains, requiring accurate predictions, robustness to varying conditions, and interpretability of the underlying mechanisms. Traditional data-driven approaches often struggle with long-term prediction accuracy, generalization to out-of-distribution (OOD) scenarios, and providing insights into the system's behavior. This thesis explores the integration of supervised disentanglement into deep learning models as a means to address these challenges. We begin by advancing the state-of-the-art in modeling wave propagation governed by the Saint-Venant equations. Utilizing U-Net architectures and purposefully designed training strategies, we develop deep learning models that significantly improve prediction accuracy. Through OOD analysis, we highlight the limitations of standard deep learning models in capturing complex spatiotemporal dynamics, demonstrating how integrating domain knowledge through architectural design and training practices can enhance model performance. We further extend our supervised disentanglement approach to high-dimensional ...

Stathi Fotiadis — Imperial College London

Deep learning for semantic description of visual human traits

The recent progress in artificial neural networks (rebranded as “deep learning”) has significantly boosted the state-of-the-art in numerous domains of computer vision, offering an opportunity to approach problems which were hardly solvable with conventional machine learning. Thus, in the frame of this PhD study, we explore how deep learning techniques can help in the analysis of one of the most basic and essential semantic traits revealed by a human face, namely, gender and age. In particular, two complementary problem settings are considered: (1) gender/age prediction from given face images, and (2) synthesis and editing of human faces with the required gender/age attributes. The Convolutional Neural Network (CNN) has become a standard model for image-based object recognition in general, and therefore is a natural choice for addressing the first of these two problems. However, our preliminary studies have shown that the ...

Antipov, Grigory — Télécom ParisTech (Eurecom)

Audio Embeddings for Semi-Supervised Anomalous Sound Detection

Detecting anomalous sounds is a difficult task: First, audio data is very high-dimensional and anomalous signal components are relatively subtle in relation to the entire acoustic scene. Second, normal and anomalous audio signals are not inherently different because defining these terms strongly depends on the application. Third, usually only normal data is available for training a system because anomalies are rare, diverse, costly to produce and in many cases unknown in advance. Such a setting is called semi-supervised anomaly detection. In domain-shifted conditions or when only very limited training data is available, all of these problems are even more severe. The goal of this thesis is to overcome these difficulties by teaching an embedding model to learn data representations suitable for semi-supervised anomalous sound detection. More specifically, an anomalous sound detection system is designed such that the resulting representations of ...

Wilkinghoff, Kevin — Rheinische Friedrich-Wilhelms-Universität Bonn

Visual ear detection and recognition in unconstrained environments

Automatic ear recognition systems have seen increased interest over recent years due to multiple desirable characteristics. Ear images used in such systems can typically be extracted from profile head shots or video footage. The acquisition procedure is contactless and non-intrusive, and it also does not depend on the cooperation of the subjects. In this regard, ear recognition technology shares similarities with other image-based biometric modalities. Another appealing property of ear biometrics is its distinctiveness. Recent studies even empirically validated existing conjectures that certain features of the ear are distinct for identical twins. This fact has significant implications for security-related applications and puts ear images on a par with epigenetic biometric modalities, such as the iris. Ear images can also supplement other biometric modalities in automatic recognition systems and provide identity cues when other information is unreliable or even unavailable. In ...

Emeršič, Žiga — University of Ljubljana, Faculty of Computer and Information Science

Development of a Framework to Enhance BVOC Imaging

Air pollution remains a major global challenge, particularly in urban areas where high pollutant concentrations negatively impact public health and contribute to climate change. Among the various pollutants, biogenic volatile organic compounds (BVOCs) play a critical role in atmospheric chemistry, influencing the formation of secondary organic aerosols and ground-level ozone, affecting air quality and climate dynamics. Accurately estimating BVOC emissions at high spatial resolution is challenging due to the limitations of satellite observations and computational models. Additionally, forecasting nitrogen dioxide (NO2) concentrations in urban environments is vital for effective air quality management, yet existing models often struggle to capture complex spatiotemporal dependencies. The thesis aims to address these challenges by proposing novel deep learning (DL) frameworks to tackle two key tasks: (i) improving the spatial resolution of BVOC emission maps through super-resolution (SR) techniques and (ii) developing a robust model ...

Giganti, Antonio — Politecnico di Milano

Robust Lung Sound and Acoustic Scene Classification

Auscultation with a stethoscope enables us to recognize pathological changes of the lung. It is a fast and inexpensive diagnosis method. However, it has several disadvantages: it is subjective, i.e. the lung sound evaluation depends on the experience of the physician; it cannot provide continuous monitoring; and it requires a trained expert. Furthermore, the characteristics of the lung sounds are in the low frequency range, where human hearing has limited sensitivity and is susceptible to noise artifacts. Exploiting the advances in digital recording devices, signal processing and machine learning, computational methods for the analysis of lung sounds have been a successful and effective approach. Computational lung sound analysis is beneficial for computer-supported diagnosis, digital storage and monitoring in critical care. Besides computational lung sound analysis, the recognition of acoustic contextual information is important in various applications. The motivation for recent research on ...

Truc Nguyen — SPSC - TUGraz

Model-based Techniques and Diffusion Models for Speech Dereverberation

Reverberation occurs in most of our environments and often degrades the intelligibility and quality of human speech, with an aggravated effect on hearing-impaired listeners. Meanwhile, the evolution of technologies for multimedia entertainment, communications and medical applications has led to a greater demand for improved sound quality. Therefore, many embedded devices now include a dereverberation algorithm, which aims to recover the anechoic component of speech. Dereverberation is an arduous task and an ill-posed inverse problem: even perfect knowledge of the room acoustics does not guarantee a perfectly dereverberated signal. Furthermore, in most real-life cases, such knowledge is not available and therefore most dereverberation algorithms are blind, i.e. they must extract information from the reverberant speech signal only. Traditional dereverberation algorithms derive anechoic speech estimators exploiting statistical properties of speech signals, distributional assumptions and even knowledge of room acoustics when available. ...

Lemercier, Jean-Marie — University of Hamburg

Constrained Non-negative Matrix Factorization for Vocabulary Acquisition from Continuous Speech

One desideratum in designing cognitive robots is autonomous learning of communication skills, just like humans. The primary step towards this goal is vocabulary acquisition. Unlike the training procedures of state-of-the-art automatic speech recognition (ASR) systems, vocabulary acquisition cannot rely on prior knowledge of language in the same way. As with infants, the acquisition process should be data-driven with multi-level abstraction and coupled with multi-modal inputs. To avoid lengthy training efforts in a word-by-word interactive learning process, a clever learning agent should be able to acquire vocabularies from continuous speech automatically. The work presented in this thesis is entitled Constrained Non-negative Matrix Factorization for Vocabulary Acquisition from Continuous Speech. Inspired by the extensively studied techniques in ASR, we design computational models to discover and represent vocabularies from continuous speech with little prior knowledge of the language to ...

Sun, Meng — Katholieke Universiteit Leuven

Learning Transferable Knowledge through Embedding Spaces (2019)