Acoustic Event Detection: Feature, Evaluation and Dataset Design

It takes more time to think of a silent scene, action or event than to find one that emanates sound. Not only speaking or playing music but almost everything that happens is accompanied by or results in one or more sounds mixed together. This makes acoustic event detection (AED) one of the most researched topics in audio signal processing nowadays, and it will probably not see a decline in the near future. This is due to the thirst for understanding and digitally abstracting more and more events in life via the enormous amount of audio recorded through thousands of applications in our daily routine. But it is also a result of two intrinsic properties of audio: it does not need a direct line of sight to be perceived, and it is less intrusive to record than image or video. Many applications such ...

Mina Mounir — KU Leuven, ESAT STADIUS


Statistical signal processing of spectrometric data: study of the pileup correction for energy spectra applied to Gamma spectrometry

The main objective of $\gamma$ spectrometry is to characterize the radioactive elements of an unknown source by studying the energy of the emitted $\gamma$ photons. When a photon interacts with a detector, its photonic energy is converted into an electrical pulse, whose integral energy is measured. The histogram obtained by collecting the energies can be used to identify radionuclides and measure their activity. However, at high counting rates, perturbations due to the stochastic nature of the temporal signal can cripple the identification of the radioactive elements. More specifically, since the detector has a finite resolution, close arrival times of photons, which can be modeled as a homogeneous Poisson process, cause pileups of individual pulses. This phenomenon distorts energy spectra by introducing multiple fake spikes and artificially prolonging the Compton continuum, which can mask spikes of low intensity. The ...

Trigano, Thomas — Télécom Paris Tech


Novel Signal Processing Techniques For The Exploitation Of Thermal Hyperspectral Data

This doctoral thesis attempts to propose a novel signal processing chain aimed at exploiting data acquired by long wave infrared (LWIR) hyperspectral sensors. In the LWIR, infrared radiation from an object is directly related to its temperature, i.e. the hotter the surface, the higher the emitted thermal energy. Hyperspectral sensors capture the energy radiated by the objects (targets) in a large number of consecutive spectral bands within the LWIR, e.g. with the aid of a prism, in order to estimate the spectrum (spectral emissivity) and the temperature of the surface material. In this framework, two main challenging tasks affect the development and the deployment of thermal hyperspectral sensors: - atmospheric correction: the process of estimating and compensating for the thermal radiation produced by the atmosphere, which affects the thermal radiation produced by the target. This process is made more complicated by the complex combination ...

Moscadelli, Matteo — University of Pisa


Novel Methods in H.264/AVC (Inter Prediction, Data Hiding, Bit Rate Transcoding)

H.264 Advanced Video Coding has become the dominant video coding standard in the market within a few years after the first version of the standard was completed by the ISO/IEC MPEG and the ITU-T VCEG groups in May 2003. That happened mainly due to the great coding efficiency of H.264. Compared to MPEG-2, the previous dominant standard, the H.264 compression ratio is about twice as high for the same video quality. That makes H.264 ideal for numerous applications, such as video broadcasting, video streaming and video conferencing. However, the H.264 efficiency is achieved at the expense of the codec's complexity. H.264 complexity is about four times that of MPEG-2. As a consequence, many video coding issues, which have been addressed in previous standards, need to be re-considered. For example, the H.264 encoding of a video in real time ...

Kapotas, Spyridon — Hellenic Open University


Three dimensional shape modeling: segmentation, reconstruction and registration

Accounting for uncertainty in three-dimensional (3D) shapes is important in a large number of scientific and engineering areas, such as biometrics, biomedical imaging, and data mining. It is well known that 3D polar shaped objects can be represented by Fourier descriptors such as spherical harmonics and double Fourier series. However, the statistics of these spectral shape models have not been widely explored. This thesis studies several areas involved in 3D shape modeling, including random field models for statistical shape modeling, optimal shape filtering, parametric active contours for object segmentation and surface reconstruction. It also investigates multi-modal image registration with respect to tumor activity quantification. Spherical harmonic expansions over the unit sphere not only provide a low dimensional polarimetric parameterization of stochastic shape, but also correspond to the Karhunen-Loève (K-L) expansion of any isotropic random field on the unit sphere. Spherical ...

Li, Jia — University of Michigan


Advanced Signal Processing Concepts for Multi-Dimensional Communication Systems

The widespread use of mobile internet and smart applications has led to an explosive growth in mobile data traffic. With the rise of smart homes, smart buildings, and smart cities, this demand is ever growing since future communication systems will require the integration of multiple networks serving diverse sectors, domains and applications, such as multimedia, virtual or augmented reality, machine-to-machine (M2M) communication / the Internet of Things (IoT), automotive applications, and many more. Therefore, future communication systems will not only be required to provide Gbps wireless connectivity but also to fulfill other requirements such as low latency and massive machine-type connectivity while ensuring the quality of service. Without significant technological advances to increase the system capacity, the existing telecommunications infrastructure will be unable to support these multi-dimensional requirements. This poses an important demand for suitable waveforms with ...

Cheema, Sher Ali — Technische Universität Ilmenau


Nonlinear processing of non-Gaussian stochastic and chaotic deterministic time series

It is often assumed that interference or noise signals are Gaussian stochastic processes. Gaussian noise models are appealing as they usually result in noise suppression algorithms that are simple: i.e. linear and closed form. However, such linear techniques may be sub-optimal when the noise process is either a non-Gaussian stochastic process or a chaotic deterministic process. In the event of encountering such noise processes, improvements in noise suppression, relative to the performance of linear methods, may be achievable using nonlinear signal processing techniques. The application of interest for this thesis is maritime surveillance radar, where the main source of interference, termed sea clutter, is widely accepted to be a non-Gaussian stochastic process at high resolutions and/or at low grazing angles. However, evidence has been presented during the last decade which suggests that sea clutter may be better modelled as a ...

Cowper, Mark — University of Edinburgh


Progressive visualization of incomplete sonar-data sets: from sea-bottom interpolation and segmentation to geometry extraction

This thesis describes a visualization pipeline for sonar profiling data that show reflections of multiple sediments in the sea bottom and that cover huge survey areas with many gaps. Visualizing such data is not trivial, because they may be noisy and because data sets may be very large. The developed techniques are: (1) Quadtree interpolation for estimating new sediment reflections, at all gaps in the longitude-latitude plane. The quadtree is used for guiding the 3D interpolation process: gaps become small at low spatial resolutions, where they can be filled by interpolating between available reflections. In the interpolation, the reflection data are cross correlated in order to construct continuity of multiple, sloping reflections. (2) Segmentation and boundary refinement in an octree in order to detect sediments in the sonar data. In the refinement, coarse boundaries are reclassified by filtering the data ...

Loke, Robert Edward — Delft University of Technology


Interpretable Machine Learning for Machine Listening

Recent years have witnessed a significant interest in interpretable machine learning (IML) research that develops techniques to analyse machine learning (ML) models. Understanding ML models is essential to gain trust in their predictions and to improve datasets, model architectures and training techniques. The majority of effort in IML research has been in analysing models that classify images or structured data and comparatively less work exists that analyses models for other domains. This research focuses on developing novel IML methods and on extending existing methods to understand machine listening models that analyse audio. In particular, this thesis reports the results of three studies that apply three different IML methods to analyse five singing voice detection (SVD) models that predict singing voice activity in musical audio excerpts. The first study introduces SoundLIME (SLIME), a method to generate temporal, spectral or time-frequency explanations ...

Mishra, Saumitra — Queen Mary University of London


On some aspects of inverse problems in image processing

This work is concerned with two image-processing problems, image deconvolution with incomplete observations and data fusion of spectral images, and with some of the algorithms that are used to solve these and related problems. In image-deconvolution problems, the diagonalization of the blurring operator by means of the discrete Fourier transform usually yields very large speedups. When there are incomplete observations (e.g., in the case of unknown boundaries), standard deconvolution techniques normally involve non-diagonalizable operators, resulting in rather slow methods, or, otherwise, use inexact convolution models, resulting in the occurrence of artifacts in the enhanced images. We propose a new deconvolution framework for images with incomplete observations that allows one to work with diagonalizable convolution operators, and therefore is very fast. The framework is also an efficient, high-quality alternative to existing methods of dealing with the image boundaries, such as edge ...

Simões, Miguel — Universidade de Lisboa, Instituto Superior Técnico & Université Grenoble Alpes


Signal processing of FMCW Synthetic Aperture Radar data

In the field of airborne earth observation there is special attention to compact, cost-effective, high-resolution imaging sensors. Such sensors are foreseen to play an important role in small-scale remote sensing applications, such as the monitoring of dikes, watercourses, or highways. Furthermore, such sensors are of military interest; reconnaissance tasks could be performed with small unmanned aerial vehicles (UAVs), reducing in this way the risk for one's own troops. In order to be operated from small, even unmanned, aircraft, such systems must consume little power and be small enough to fulfill the usually strict payload requirements. Moreover, to be of interest for the civil market, cost effectiveness is mandatory. Frequency Modulated Continuous Wave (FMCW) radar systems are generally compact and relatively cheap to purchase and to exploit. They consume little power and, due to the fact that they are ...

Meta, Adriano — Delft University of Technology


Contributions to Human Motion Modeling and Recognition using Non-intrusive Wearable Sensors

This thesis contributes to motion characterization through inertial and physiological signals captured by wearable devices and analyzed using signal processing and deep learning techniques. This research leverages the possibilities of motion analysis for three main applications: to know what physical activity a person is performing (Human Activity Recognition), to identify who is performing that motion (user identification) or know how the movement is being performed (motor anomaly detection). Most previous research has addressed human motion modeling using invasive sensors in contact with the user or intrusive sensors that modify the user’s behavior while performing an action (cameras or microphones). In this sense, wearable devices such as smartphones and smartwatches can collect motion signals from users during their daily lives in a less invasive or intrusive way. Recently, there has been an exponential increase in research focused on inertial-signal processing to ...

Gil-Martín, Manuel — Universidad Politécnica de Madrid


Non-rigid Registration-based Data-driven 3D Facial Action Unit Detection

Automated analysis of facial expressions has been an active area of study due to its potential applications not only for intelligent human-computer interfaces but also for human facial behavior research. To advance automatic expression analysis, this thesis proposes and empirically proves two hypotheses: (i) 3D face data is a better data modality than conventional 2D camera images, not only for being much less disturbed by illumination and head pose effects but also for capturing true facial surface information. (ii) It is possible to perform detailed face registration without resorting to any face modeling. This means that data-driven methods in automatic expression analysis can compensate for confounding effects such as pose and physiognomy differences, and can process facial features more effectively, without suffering the drawbacks of model-driven analysis. Our study is based upon the Facial Action Coding System (FACS) as this paradigm ...

Savran, Arman — Bogazici University


Selected Topics in Inertial and Visual Sensor Fusion: Calibration, Observability Analysis and Applications

Recent improvements in the development of inertial and visual sensors allow building small, lightweight, and cheap motion capture systems, which are becoming a standard feature of smartphones and personal digital assistants. This dissertation describes developments of new motion sensing strategies using inertial and inertial-visual sensors. The thesis contributions are presented in two parts. The first part focuses mainly on the use of inertial measurement units. First, the problem of sensor calibration is addressed and a low-cost and accurate method to calibrate the accelerometer cluster of this unit is proposed. The method is based on the maximum likelihood estimation framework, which results in a minimum variance unbiased estimator. Then, using the inertial measurement unit, a probabilistic user-independent method is proposed for pedestrian activity classification and gait analysis. The work targets two groups of applications including human activity classification and joint human activity and ...

Panahandeh, Ghazaleh — KTH Royal Institute of Technology


Distributed Detection and Localization

This thesis delves into the detection and localization aspects of distributed Wireless Sensor Networks (WSNs). Specifically, the research concentrates on WSNs in which sensors autonomously carry out detection tasks and transmit their decisions to a fusion center (FC). The FC’s role is to make a comprehensive decision about the presence of a specific event of interest and estimate its potential location. Given its broad significance, the thesis specializes in applying WSNs for industrial monitoring, particularly in the process and energy industry. Three distinct approaches are explored in this thesis: (i) per-sample/batch detection, (ii) quickest detection, and (iii) sequential detection. Each framework proposes a set of detection and associated localization rules. A primary objective of this work is to develop detection and localization strategies that leverage existing information about the monitored environment, bridging the gap between monitoring systems and the knowledge ...

Gianluca Tabella — Norwegian University of Science and Technology
