Data-Driven Anomaly Detection and Virtual Sensing for Engine Control (2024)
Towards Motion Capture with Minimal Sensing
Human motion capture is important for a wide variety of applications, e.g., biomechanical analysis, virtual reality and character animation. Current human motion capture solutions require a large number of markers/sensors to be placed on the body. This work shows that the number can be reduced by using data-driven approaches. First, lazy and eager learning methods are compared for estimating full-body movements from a minimal sensor set; both learning approaches lead to similar estimation accuracy. Next, the time coherency of the output poses of the previously developed eager learning method is improved by using a stacked input neural network. Results show that these deep and shallow learning approaches achieve comparable accuracy in estimating full-body poses using only five inertial sensors. The developed approach is then applied to ...
Wouda, Frank — University of Twente
Disentanglement for improved data-driven modeling of dynamical systems
Modeling dynamical systems is a fundamental task in various scientific and engineering domains, requiring accurate predictions, robustness to varying conditions, and interpretability of the underlying mechanisms. Traditional data-driven approaches often struggle with long-term prediction accuracy, generalization to out-of-distribution (OOD) scenarios, and providing insights into the system's behavior. This thesis explores the integration of supervised disentanglement into deep learning models as a means to address these challenges. We begin by advancing the state-of-the-art in modeling wave propagation governed by the Saint-Venant equations. Utilizing U-Net architectures and purposefully designed training strategies, we develop deep learning models that significantly improve prediction accuracy. Through OOD analysis, we highlight the limitations of standard deep learning models in capturing complex spatiotemporal dynamics, demonstrating how integrating domain knowledge through architectural design and training practices can enhance model performance. We further extend our supervised disentanglement approach to high-dimensional ...
Stathi Fotiadis — Imperial College London
Contributions to Human Motion Modeling and Recognition using Non-intrusive Wearable Sensors
This thesis contributes to motion characterization through inertial and physiological signals captured by wearable devices and analyzed using signal processing and deep learning techniques. This research leverages the possibilities of motion analysis for three main applications: to know what physical activity a person is performing (Human Activity Recognition), to identify who is performing that motion (user identification) or know how the movement is being performed (motor anomaly detection). Most previous research has addressed human motion modeling using invasive sensors in contact with the user or intrusive sensors that modify the user’s behavior while performing an action (cameras or microphones). In this sense, wearable devices such as smartphones and smartwatches can collect motion signals from users during their daily lives in a less invasive or intrusive way. Recently, there has been an exponential increase in research focused on inertial-signal processing to ...
Gil-Martín, Manuel — Universidad Politécnica de Madrid
Acoustic Event Detection: Feature, Evaluation and Dataset Design
It takes more time to think of a silent scene, action or event than to find one that emanates sound. Not only speaking or playing music but almost everything that happens is accompanied by or results in one or more sounds mixed together. This makes acoustic event detection (AED) one of the most researched topics in audio signal processing today, and it will probably not see a decline in the near future. This is due to the thirst for understanding and digitally abstracting more and more events in life via the enormous amount of audio recorded through thousands of applications in our daily routine. But it is also a result of two intrinsic properties of audio: it doesn’t need a direct line of sight to be perceived and is less intrusive to record than image or video. Many applications such ...
Mina Mounir — KU Leuven, ESAT STADIUS
Single-channel source separation for radio-frequency (RF) systems is a challenging problem relevant to key applications, including wireless communications, radar, and spectrum monitoring. This thesis addresses the challenge by focusing on data-driven approaches for source separation, leveraging datasets of sample realizations when source models are not explicitly provided. To this end, deep learning techniques are employed as function approximations for source separation, with models trained using available data. Two problem abstractions are studied as benchmarks for our proposed deep-learning approaches. Through a simplified problem involving Orthogonal Frequency Division Multiplexing (OFDM), we reveal the limitations of existing deep learning solutions and suggest modifications that account for the signal modality for improved performance. Further, we study the impact of time shifts on the formulation of an optimal estimator for cyclostationary Gaussian time series, serving as a performance lower bound for evaluating data-driven methods. ...
Lee, Cheng Feng Gary — Massachusetts Institute of Technology
Deep Learning of GNSS Signal Detection
Global Navigation Satellite Systems (GNSS) is the de facto technology for Position, Navigation, and Timing (PNT) applications when it is available. GNSS relies on one or more satellite constellations that transmit ranging signals, which a receiver can use to self-localize. Signal acquisition is a crucial step in GNSS receivers, which is typically solved by maximizing the so-called Cross Ambiguity Function (CAF) resulting from a hypothesis testing problem. The CAF is a two-dimensional function that is related to the correlation between the received signal and a local code replica for every possible delay/Doppler pair, which is then maximized for signal detection and coarse synchronization. The outcome of this statistical process decides whether the signal from a particular satellite is present or absent in the received signal, as well as provides a rough estimate of its associated code delay and Doppler frequency, ...
Borhani Darian, Parisa — Northeastern University
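As a rough illustration of the conventional acquisition step described in the abstract above (not the deep learning method of the thesis), the sketch below evaluates a squared-magnitude CAF on a delay/Doppler grid and picks its maximum. Everything here is an assumption made for the example: the function name `cross_ambiguity`, the random ±1 sequence standing in for a real satellite ranging code, the sampling rate of one sample per chip, the noise level, and the Doppler grid.

```python
import numpy as np


def cross_ambiguity(received, code, doppler_bins, fs):
    """Squared-magnitude CAF on a (Doppler, delay) grid.

    Hypothetical helper for illustration: for each Doppler hypothesis the
    carrier is wiped off, then the circular correlation with the local code
    replica is computed for all delays at once via the FFT.
    """
    n = np.arange(received.size)
    code_fft_conj = np.conj(np.fft.fft(code))
    grid = np.empty((len(doppler_bins), received.size))
    for i, fd in enumerate(doppler_bins):
        # Remove the hypothesised Doppler shift from the received samples.
        wiped = received * np.exp(-2j * np.pi * fd * n / fs)
        # Circular cross-correlation over every code delay.
        corr = np.fft.ifft(np.fft.fft(wiped) * code_fft_conj)
        grid[i] = np.abs(corr) ** 2
    return grid


# Synthetic test signal (all values are assumptions for this sketch).
rng = np.random.default_rng(0)
fs = 1.023e6                                  # one sample per chip
code = rng.choice([-1.0, 1.0], size=1023)     # stand-in for a PRN code
true_delay, true_doppler = 200, 2000.0        # samples, Hz

n = np.arange(code.size)
received = np.roll(code, true_delay) * np.exp(2j * np.pi * true_doppler * n / fs)
received = received + 0.5 * (
    rng.standard_normal(code.size) + 1j * rng.standard_normal(code.size)
)

# Coarse search: maximise the CAF over the delay/Doppler grid.
doppler_bins = np.arange(-5000.0, 5001.0, 500.0)
grid = cross_ambiguity(received, code, doppler_bins, fs)
i, k = np.unravel_index(np.argmax(grid), grid.shape)
est_doppler, est_delay = doppler_bins[i], int(k)
print(est_delay, est_doppler)
```

In a real receiver the peak value would additionally be compared against a detection threshold to decide whether the satellite signal is present; that step is omitted here.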
Modeling of Magnetic Fields and Extended Objects for Localization Applications
The level of automation in our society is ever increasing. Technologies like self-driving cars, virtual reality, and fully autonomous robots, which all were unimaginable a few decades ago, are realizable today, and will become standard consumer products in the future. These technologies depend upon autonomous localization and situation awareness where careful processing of sensory data is required. To increase efficiency, robustness and reliability, appropriate models for these data are needed. In this thesis, such models are analyzed within three different application areas, namely (1) magnetic localization, (2) extended target tracking, and (3) autonomous learning from raw pixel information. Magnetic localization is based on one or more magnetometers measuring the induced magnetic field from magnetic objects. In this thesis we present a model for determining the position and the orientation of small magnets with an accuracy of a few millimeters. This ...
Wahlström, Niklas — Linköping University
Wireless Localization via Learned Channel Features in Massive MIMO Systems
Future wireless networks will evolve to integrate communication, localization, and sensing capabilities. This evolution is driven by emerging application platforms such as digital twins, on the one hand, and advancements in wireless technologies, on the other, characterized by increased bandwidths, more antennas, and enhanced computational power. Crucial to this development is the application of artificial intelligence (AI), which is set to harness the vast amounts of available data in the sixth-generation (6G) of mobile networks and beyond. Integrating AI and machine learning (ML) algorithms, in particular, with wireless localization offers substantial opportunities to refine communication systems, improve the ability of wireless networks to locate the users precisely, enable context-aware transmission, and utilize processing and energy resources more efficiently. In this dissertation, advanced ML algorithms for enhanced wireless localization are proposed. Motivated by the capabilities of deep neural networks (DNNs) and ...
Artan Salihu — TU Wien
This dissertation develops false discovery rate (FDR) controlling machine learning algorithms for large-scale high-dimensional data. Ensuring the reproducibility of discoveries based on high-dimensional data is pivotal in numerous applications. The developed algorithms perform fast variable selection tasks in large-scale high-dimensional settings where the number of variables may be much larger than the number of samples. This includes large-scale data with up to millions of variables such as genome-wide association studies (GWAS). Theoretical finite sample FDR-control guarantees based on martingale theory have been established proving the trustworthiness of the developed methods. The practical open-source R software packages TRexSelector and tlars, which implement the proposed algorithms, have been published on the Comprehensive R Archive Network (CRAN). Extensive numerical experiments and real-world problems in biomedical and financial engineering demonstrate the performance in challenging use-cases. The first three main parts of this dissertation present ...
Machkour, Jasin — Technische Universität Darmstadt
Model-based Techniques and Diffusion Models for Speech Dereverberation
Reverberation occurs in most of our environments and often degrades the intelligibility and quality of human speech, with an aggravated effect on hearing-impaired listeners. Meanwhile, the evolution of technologies for multimedia entertainment, communications and medical applications has led to a greater demand for improved sound quality. Therefore, many embedded devices now include a dereverberation algorithm, which aims to recover the anechoic component of speech. Dereverberation is an arduous task and an ill-posed inverse problem: even perfectly knowing the room acoustics does not guarantee to obtain a perfectly dereverberated signal. Furthermore, in most real-life cases, such knowledge is not available and therefore most dereverberation algorithms are blind, i.e. they must extract information from the reverberant speech signal only. Traditional dereverberation algorithms derive anechoic speech estimators exploiting statistical properties of speech signals, distributional assumptions and even knowledge of room acoustics when available. ...
Lemercier, Jean-Marie — University of Hamburg
Predictive modelling and deep learning for quantifying human health
Machine learning and deep learning techniques have emerged as powerful tools for addressing complex challenges across diverse domains. These methodologies are powerful because they extract patterns and insights from large and complex datasets, automate decision-making processes, and continuously improve over time. They enable us to observe and quantify patterns in data that a normal human would not be able to capture, leading to deeper insights and more accurate predictions. This dissertation presents two research papers that leverage these methodologies to tackle distinct yet interconnected problems in neuroimaging and computer vision for the quantification of human health. The first investigation, "Age prediction using resting-state functional MRI," addresses the challenge of understanding brain aging. By employing the Least Absolute Shrinkage and Selection Operator (LASSO) on resting-state functional MRI (rsfMRI) data, we identify the most predictive correlations related to brain age. Our study, ...
Chang Jose — National Cheng Kung University
Mixed structural models for 3D audio in virtual environments
In the world of Information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflecting e.g. on developing complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, multimodal realistic models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality are required. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction and teleconferencing systems, to name but ...
Geronazzo, Michele — University of Padova
In natural listening environments, speech signals are easily distorted by various acoustic interference, which reduces speech quality and intelligibility for human listeners; it also creates difficulties for many speech-related applications, such as automatic speech recognition (ASR). Thus, many speech enhancement (SE) algorithms have been developed in the past decades. However, most current SE algorithms struggle to capture underlying speech information (e.g., phonemes) in the SE process. This makes it challenging to know what specific information is lost or interfered with in the SE process, which limits the application of enhanced speech. For instance, some SE algorithms aimed at improving human listening usually degrade ASR performance. The objective of this dissertation is to develop SE algorithms that have the potential to capture various underlying speech representations (information) and improve the quality and intelligibility of noisy speech. This ...
Xiang, Yang — Aalborg University, Capturi A/S
Voice biometric system security: Design and analysis of countermeasures for replay attacks
Voice biometric systems use automatic speaker verification (ASV) technology for user authentication. Even if it is among the most convenient means of biometric authentication, the robustness and security of ASV in the face of spoofing attacks (or presentation attacks) is of growing concern and is now well acknowledged by the research community. A spoofing attack involves illegitimate access to personal data of a targeted user. Replay is among the simplest attacks to mount - yet difficult to detect reliably and is the focus of this thesis. This research focuses on the analysis and design of existing and novel countermeasures for replay attack detection in ASV, organised in two major parts. The first part of the thesis investigates existing methods for spoofing detection from several perspectives. I first study the generalisability of hand-crafted features for replay detection that show promising results ...
Bhusan Chettri — Queen Mary University of London
Deep Learning for Audio Effects Modeling
Audio effects modeling is the process of emulating an audio effect unit and seeks to recreate the sound, behaviour and main perceptual features of an analog reference device. Audio effect units are analog or digital signal processing systems that transform certain characteristics of the sound source. These transformations can be linear or nonlinear, time-invariant or time-varying and with short-term and long-term memory. Most typical audio effect transformations are based on dynamics, such as compression; tone such as distortion; frequency such as equalization; and time such as artificial reverberation or modulation based audio effects. The digital simulation of these audio processors is normally done by designing mathematical models of these systems. This is often difficult because it seeks to accurately model all components within the effect unit, which usually contains mechanical elements together with nonlinear and time-varying analog electronics. Most existing ...
Martínez Ramírez, Marco A — Queen Mary University of London