Radial Basis Function Network Robust Learning Algorithms in Computer Vision Applications

This thesis introduces new learning algorithms for Radial Basis Function (RBF) networks. An RBF network is a feed-forward two-layer neural network used for function approximation or pattern classification. The proposed training algorithms are based on robust statistics. Their theoretical performance has been assessed and compared with that of classical algorithms for training RBF networks. The applications of RBF networks described in this thesis consist of the simultaneous modeling of moving object segmentation and optical flow estimation in image sequences, and of 3-D image modeling and segmentation. A Bayesian classifier model is used for the representation of the image sequence and 3-D images. This model employs an energy-based description of the probability functions involved. The energy functions are represented by RBF networks whose inputs are various features drawn from the images and whose outputs are objects. The hidden units embed kernel functions. Each kernel ...

Bors, Adrian G. — Aristotle University of Thessaloniki
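
The two-layer structure described in this abstract (hidden units that embed kernel functions, followed by an output layer) can be sketched as a forward pass. This is a minimal illustration only: the Gaussian kernel, the toy centers, widths, and weights, and the names `gaussian_kernel` and `rbf_forward` are assumptions made for the example, not taken from the thesis, and the robust training algorithms it proposes are not shown.

```python
import math


def gaussian_kernel(x, center, width):
    """Gaussian radial basis function of the distance between x and center."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def rbf_forward(x, centers, widths, weights):
    """Hidden layer: one kernel per unit. Output layer: weighted sums of kernels."""
    hidden = [gaussian_kernel(x, c, w) for c, w in zip(centers, widths)]
    return [sum(w_kj * h for w_kj, h in zip(row, hidden)) for row in weights]


# Hypothetical toy network: two hidden units and two outputs (one per class).
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [0.5, 0.5]
weights = [[1.0, 0.0], [0.0, 1.0]]  # weights[k][j]: hidden unit j -> output k

scores = rbf_forward([0.1, 0.0], centers, widths, weights)
predicted_class = scores.index(max(scores))  # class with the largest output
```

In a classification setting, the input is assigned to the output with the largest score; here the input lies close to the first center, so the first output dominates.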


Causal Inference from Time Series: Methods for Discovering, Explaining, and Estimating Causal Relationships

Across various fields of engineering and science, there is great interest in studying causal relationships between time series. Distinguishing cause from effect is difficult in practice for many reasons, including limited access to data, unknown functional relationships, and unobserved confounding factors. Due to these challenges, modern causal inference requires methods that can perform robust detection and estimation, quantify uncertainty, and explain how a model’s inputs contribute to its predictions. These challenges are further compounded in time series settings, where autocorrelation and temporal patterns can skew inference. This thesis introduces several contributions to the field of causal inference that aim to address these concerns. The first part of the thesis examines approaches to causal discovery and the detection and estimation of causal relationships, with a focus on time-series data. The second part of the thesis considers the explanation of causal models and ...

Butler, Kurt — Stony Brook University


Representation and Metric Learning Advances for Deep Neural Network Face and Speaker Biometric Systems

The increasing use of technological devices and biometric recognition systems in people's daily lives has motivated a great deal of research interest in the development of effective and robust systems. However, there are still some challenges to be solved in these systems when Deep Neural Networks (DNNs) are employed. For this reason, this thesis proposes different approaches to address these issues. First of all, we have analyzed the effect of introducing the most widespread DNN architectures to develop systems for face and text-dependent speaker verification tasks. In this analysis, we observed that state-of-the-art DNNs established for many tasks, including face verification, did not perform efficiently for text-dependent speaker verification. Therefore, we have conducted a study to find the cause of this poor performance and we have noted that under certain circumstances this problem is due to the use of a ...

Mingote, Victoria — University of Zaragoza


Deep Learning for Distant Speech Recognition

Deep learning is an emerging technology that is considered one of the most promising directions for reaching higher levels of artificial intelligence. Among other achievements, building computers that understand speech represents a crucial leap towards intelligent machines. Despite the great efforts of the past decades, however, a natural and robust human-machine speech interaction still appears to be out of reach, especially when users interact with a distant microphone in noisy and reverberant environments. The latter disturbances severely hamper the intelligibility of a speech signal, making Distant Speech Recognition (DSR) one of the major open challenges in the field. This thesis addresses the latter scenario and proposes some novel techniques, architectures, and algorithms to improve the robustness of distant-talking acoustic models. We first elaborate on methodologies for realistic data contamination, with a particular emphasis on DNN training with simulated data. ...

Ravanelli, Mirco — Fondazione Bruno Kessler


Adaptive Edge-Enhanced Correlation Based Robust and Real-Time Visual Tracking Framework and Its Deployment in Machine Vision Systems

An adaptive edge-enhanced correlation based robust and real-time visual tracking framework, and two machine vision systems based on the framework are proposed. The visual tracking algorithm can track any object of interest in a video acquired from a stationary or moving camera. It can handle the real-world problems, such as noise, clutter, occlusion, uneven illumination, varying appearance, orientation, scale, and velocity of the maneuvering object, and object fading and obscuration in low contrast video at various zoom levels. The proposed machine vision systems are an active camera tracking system and a vision based system for a UGV (unmanned ground vehicle) to handle a road intersection. The core of the proposed visual tracking framework is an Edge Enhanced Back-propagation neural-network Controlled Fast Normalized Correlation (EE-BCFNC), which makes the object localization stage efficient and robust to noise, object fading, obscuration, and uneven ...

Ahmed, Javed — Electrical (Telecom.) Engineering Department, National University of Sciences and Technology, Rawalpindi, Pakistan.


Nonlinear rate control techniques for constant bit rate MPEG video coders

Digital visual communication has been increasingly adopted as an efficient new medium in a variety of different fields: multimedia computers, digital televisions, telecommunications, etc. Exchange of visual information between remote sites requires that digital video is encoded by compressing the amount of data and transmitting it through specified network connections. The compression and transmission of digital video is an amalgamation of statistical data coding processes, which aims at efficient exchange of visual information without technical barriers due to different standards, services, media, etc. It is associated with a series of different disciplines of digital signal processing, each of which can be applied independently. It includes a few different technical principles: distortion, rate theory, prediction techniques and control theory. The MPEG (Moving Picture Experts Group) video compression standard is based on this paradigm and thus contains a variety of different coding ...

Saw, Yoo-Sok — University of Edinburgh


Sequential Reasoning with Socially Caused Beliefs

Machine learning and artificial intelligence methods have achieved remarkable success, matching and even surpassing human capabilities in various complex tasks. However, many demonstrations have generally neglected a critical part of the intelligence that is prevalent in the real world, namely, the one that emerges from the collective of interconnected individuals with diverse capabilities, perspectives and experiences. To explore this fact, the current dissertation utilizes mathematical models of collaborative learning and reasoning. These models are based on the following two concepts: Bayesian inference, which is used to model how agents update their beliefs in the face of uncertain data, and graphs, which represent the communication links and information exchange among individuals. Through these models, the current work examines the effect of dynamic models on learning, as well as the implications of causal interactions among agents on their decisions. In particular, this ...

Kayaalp, Mert — EPFL


Interpretable Machine Learning for Machine Listening

Recent years have witnessed a significant interest in interpretable machine learning (IML) research that develops techniques to analyse machine learning (ML) models. Understanding ML models is essential to gain trust in their predictions and to improve datasets, model architectures and training techniques. The majority of effort in IML research has been in analysing models that classify images or structured data and comparatively less work exists that analyses models for other domains. This research focuses on developing novel IML methods and on extending existing methods to understand machine listening models that analyse audio. In particular, this thesis reports the results of three studies that apply three different IML methods to analyse five singing voice detection (SVD) models that predict singing voice activity in musical audio excerpts. The first study introduces SoundLIME (SLIME), a method to generate temporal, spectral or time-frequency explanations ...

Mishra, Saumitra — Queen Mary University of London


Signal Processing and Learning over Topological Spaces

The aim of this thesis is to introduce a variety of signal processing methodologies specifically designed to model, interpret, and learn from data structured within topological spaces. These spaces are loosely characterized as a collection of points together with a neighborhood notion among points. The methodologies and tools discussed herein hold particular relevance and utility when applied to signals defined over combinatorial topological spaces, such as cell complexes, or within metric spaces that exhibit non-trivial properties, such as Riemannian manifolds with non-flat metrics. One of the primary motivations behind this research is to address and surmount the constraints encountered with traditional graph-based representations when they are employed to depict intricate systems. This thesis emphasizes the necessity to account for sophisticated, multiway, and geometry-sensitive interactions that are not adequately captured by conventional graph models. The contributions of this work include but ...

Battiloro, Claudio — Sapienza University of Rome


Wireless Localization via Learned Channel Features in Massive MIMO Systems

Future wireless networks will evolve to integrate communication, localization, and sensing capabilities. This evolution is driven by emerging application platforms such as digital twins, on the one hand, and advancements in wireless technologies, on the other, characterized by increased bandwidths, more antennas, and enhanced computational power. Crucial to this development is the application of artificial intelligence (AI), which is set to harness the vast amounts of available data in the sixth generation (6G) of mobile networks and beyond. Integrating AI and machine learning (ML) algorithms, in particular, with wireless localization offers substantial opportunities to refine communication systems, improve the ability of wireless networks to locate users precisely, enable context-aware transmission, and utilize processing and energy resources more efficiently. In this dissertation, advanced ML algorithms for enhanced wireless localization are proposed. Motivated by the capabilities of deep neural networks (DNNs) and ...

Artan Salihu — TU Wien


Good Features to Correlate for Visual Tracking

Estimating object motion is one of the key components of video processing and the first step in applications which require video representation. Visual object tracking is one way of extracting this component, and it is one of the major problems in the field of computer vision. Numerous discriminative and generative machine learning approaches have been employed to solve this problem. Recently, correlation filter based (CFB) approaches have been popular due to their computational efficiency and notable performances on benchmark datasets. The ultimate goal of CFB approaches is to find a filter (i.e., template) which can produce high correlation outputs around the actual object location and low correlation outputs around the locations that are far from the object. Nevertheless, CFB visual tracking methods suffer from many challenges, such as occlusion, abrupt appearance changes, fast motion and object deformation. The main reasons ...

Gundogdu, Erhan — Middle East Technical University
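
The goal stated in this abstract, a template that yields high correlation at the object location and low correlation elsewhere, can be illustrated in one dimension. The sketch below is an assumption-laden stand-in: it slides a fixed template over a signal using zero-mean normalized cross-correlation, whereas correlation filter based trackers learn the filter from data. The names `normalized_correlation` and `locate` and the toy signal are hypothetical.

```python
import math


def normalized_correlation(patch, template):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    n = len(template)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    den = math.sqrt(
        sum((p - mean_p) ** 2 for p in patch)
        * sum((t - mean_t) ** 2 for t in template)
    )
    # A constant patch has zero variance; treat its response as 0.
    return num / den if den > 0 else 0.0


def locate(signal, template):
    """Slide the template over the signal; return the best offset and all responses."""
    n = len(template)
    responses = [
        normalized_correlation(signal[i:i + n], template)
        for i in range(len(signal) - n + 1)
    ]
    best = max(range(len(responses)), key=responses.__getitem__)
    return best, responses


# Hypothetical toy data: the "object" [1, 3, 1] sits at offset 2.
signal = [0, 0, 1, 3, 1, 0, 0, 0]
template = [1, 3, 1]
best_index, responses = locate(signal, template)
```

The response peaks at the offset where the signal matches the template and is lower at every other offset, which is the behaviour a tracker exploits to localize the object in each frame.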


Data-Driven Radio Planning and Cellular Network Optimization

Integrating AI into wireless network design and management is essential for creating self-sustaining 6G networks. A key challenge is the development of automated network procedures with minimal human intervention, leveraging real-time monitoring data for immediate feedback. These advancements promote data-driven decision-making but pose risks related to data availability, safety, and the black-box nature of learning algorithms. This cumulative thesis proposes and evaluates novel procedures and algorithms for data-driven radio planning and cellular network optimization, addressing practical challenges in applying learning-based methods to real-world deployments. It emphasizes the utility of monitoring data and the integration of model-based and model-free methods, ensuring the scalability and safety of adaptive network procedures across diverse environments. The first part of the thesis explores the application of deep learning to radio propagation modeling in live cellular networks. The first paper presents a novel network ...

Lukas Eller — TU Wien


Deep Learning Techniques for Visual Counting

The explosion of Deep Learning (DL) added a boost to the already rapidly developing field of Computer Vision to such a point that vision-based tasks are now parts of our everyday lives. Applications such as image classification, photo stylization, or face recognition are nowadays pervasive, as evidenced by the advent of modern systems trivially integrated into mobile applications. In this thesis, we investigated and enhanced the visual counting task, which automatically estimates the number of objects in still images or video frames. Recently, due to the growing interest in it, several Convolutional Neural Network (CNN)-based solutions have been suggested by the scientific community. These artificial neural networks, inspired by the organization of the animal visual cortex, provide a way to automatically learn effective representations from raw visual data and can be successfully employed to address typical challenges characterizing this task, ...

Ciampi, Luca — University of Pisa


Disentanglement for improved data-driven modeling of dynamical systems

Modeling dynamical systems is a fundamental task in various scientific and engineering domains, requiring accurate predictions, robustness to varying conditions, and interpretability of the underlying mechanisms. Traditional data-driven approaches often struggle with long-term prediction accuracy, generalization to out-of-distribution (OOD) scenarios, and providing insights into the system's behavior. This thesis explores the integration of supervised disentanglement into deep learning models as a means to address these challenges. We begin by advancing the state-of-the-art in modeling wave propagation governed by the Saint-Venant equations. Utilizing U-Net architectures and purposefully designed training strategies, we develop deep learning models that significantly improve prediction accuracy. Through OOD analysis, we highlight the limitations of standard deep learning models in capturing complex spatiotemporal dynamics, demonstrating how integrating domain knowledge through architectural design and training practices can enhance model performance. We further extend our supervised disentanglement approach to high-dimensional ...

Stathi Fotiadis — Imperial College London


Bayesian data fusion for distributed learning

This dissertation explores the intersection of data fusion, federated learning, and Bayesian methods, with a focus on their applications in indoor localization, GNSS, and image processing. Data fusion involves integrating data and knowledge from multiple sources. It becomes essential when data is only available in a distributed fashion or when different sensors are used to infer a quantity of interest. Data fusion typically includes raw data fusion, feature fusion, and decision fusion. In this thesis, we will concentrate on feature fusion. Distributed data fusion involves merging sensor data from different sources to estimate an unknown process. A Bayesian framework is often used because it can provide an optimal and explainable feature by preserving the full distribution of the unknown given the data, called the posterior, over the estimated process at each agent. This allows for easy and recursive merging of sensor data ...

Peng Wu — Northeastern University
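
The recursive merging mentioned in this abstract can be illustrated with the simplest Bayesian case. The sketch below assumes each agent reports an independent scalar Gaussian estimate (mean and variance) of the same unknown under a flat prior, so that fusion reduces to precision-weighted averaging. These assumptions, the toy numbers, and the names `fuse_gaussians` and `fuse_all` are illustrative and are not taken from the dissertation.

```python
def fuse_gaussians(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates of the same scalar quantity."""
    precision = 1.0 / var_a + 1.0 / var_b          # precisions add
    fused_var = 1.0 / precision
    fused_mean = fused_var * (mean_a / var_a + mean_b / var_b)
    return fused_mean, fused_var


def fuse_all(estimates):
    """Recursively merge a list of (mean, variance) pairs, one agent at a time."""
    mean, var = estimates[0]
    for m, v in estimates[1:]:
        mean, var = fuse_gaussians(mean, var, m, v)
    return mean, var


# Hypothetical estimates from three agents: (mean, variance).
estimates = [(10.0, 4.0), (12.0, 4.0), (11.0, 2.0)]
fused_mean, fused_var = fuse_all(estimates)
```

Because the fused result is again a (mean, variance) pair, it can be merged with further estimates in any order, and the fused variance is never larger than the smallest input variance.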
