Acoustic Event Detection: Feature, Evaluation and Dataset Design

It takes more time to think of a silent scene, action or event than to find one that emanates sound. Not only speaking or playing music: almost everything that happens is accompanied by, or results in, one or more sounds mixed together. This makes acoustic event detection (AED) one of the most researched topics in audio signal processing today, and its momentum is unlikely to decline in the near future. This is driven by the thirst for understanding and digitally abstracting ever more events in life from the enormous amount of audio recorded by thousands of applications in our daily routine. But it is also a result of two intrinsic properties of audio: it does not require a direct line of sight to be perceived, and it is less intrusive to record than image or video. Many applications such ...

Mounir, Mina — KU Leuven, ESAT STADIUS


Statistical signal processing of spectrometric data: study of the pileup correction for energy spectra applied to Gamma spectrometry

The main objective of $\gamma$ spectrometry is to characterize the radioactive elements of an unknown source by studying the energy of the emitted $\gamma$ photons. When a photon interacts with a detector, its energy is converted into an electrical pulse, whose integral is measured. The histogram obtained by collecting these energies can be used to identify radionuclides and measure their activity. However, at high counting rates, perturbations due to the stochastic nature of the temporal signal can cripple the identification of the radioactive elements. More specifically, since the detector has a finite resolution, close arrival times of photons, which can be modeled as a homogeneous Poisson process, cause pileups of individual pulses. This phenomenon distorts energy spectra by introducing multiple spurious peaks and artificially prolonging the Compton continuum, which can mask peaks of low intensity. The ...
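As a rough illustration of the pileup mechanism described above (a minimal sketch, not the correction method developed in the thesis; the rate, resolving time and line energies are made up for the example), one can simulate Poisson photon arrivals and merge pulses that fall within the detector's resolving time:

```python
# Illustrative pileup simulation: Poisson arrivals + finite pulse resolution.
import numpy as np

rng = np.random.default_rng(0)

rate = 5e4            # photon arrival rate [1/s], illustrative
t_res = 2e-6          # pulse resolving time [s], illustrative
n_photons = 200_000

# Homogeneous Poisson process: exponential inter-arrival times.
arrivals = np.cumsum(rng.exponential(1.0 / rate, n_photons))

# Toy source: one gamma line plus a flat Compton-like background.
energies = np.where(rng.random(n_photons) < 0.6,
                    662.0,                                # keV, Cs-137-like line
                    rng.uniform(50.0, 500.0, n_photons))  # continuum stand-in

# Pulses closer than t_res are merged: their energies add up (pileup).
measured = []
acc, t_last = energies[0], arrivals[0]
for t, e in zip(arrivals[1:], energies[1:]):
    if t - t_last < t_res:
        acc += e          # overlapping pulses -> one summed (fake) energy
    else:
        measured.append(acc)
        acc = e
    t_last = t
measured.append(acc)

hist, edges = np.histogram(measured, bins=300, range=(0, 1500))
# The piled-up spectrum shows a spurious sum peak near 2 x 662 keV and an
# artificially extended continuum, exactly the distortions described above.
```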

Trigano, Thomas — Télécom Paris Tech


Novel Signal Processing Techniques For The Exploitation Of Thermal Hyperspectral Data

This doctoral thesis proposes a novel signal processing chain aimed at exploiting data acquired by long wave infrared (LWIR) hyperspectral sensors. In the LWIR, the infrared radiation emitted by an object is directly related to its temperature: the hotter the surface, the higher the emitted thermal energy. Hyperspectral sensors capture the energy radiated by the objects (targets) in a large number of consecutive spectral bands within the LWIR, e.g. with the aid of a prism, in order to estimate the spectrum (spectral emissivity) and the temperature of the surface material. In this framework, two main challenging tasks affect the development and the deployment of thermal hyperspectral sensors: - atmospheric correction: the process of estimating and compensating for the thermal radiation produced by the atmosphere, which corrupts the thermal radiation produced by the target. This process is made more complicated by the complex combination ...
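The temperature-radiance relation invoked here is Planck's law scaled by the surface emissivity. A small sketch of that standard physics (not code from the thesis; the band limits and emissivity value are illustrative):

```python
# Blackbody radiance vs. temperature in the LWIR band (Planck's law).
import numpy as np

H = 6.626e-34   # Planck constant [J s]
C = 2.998e8     # speed of light [m/s]
KB = 1.381e-23  # Boltzmann constant [J/K]

def planck_radiance(wavelength_m, temp_k):
    """Blackbody spectral radiance [W / (m^2 sr m)]."""
    num = 2.0 * H * C**2 / wavelength_m**5
    return num / (np.exp(H * C / (wavelength_m * KB * temp_k)) - 1.0)

# LWIR band, roughly 8-12 micrometres; emissivity value is illustrative.
wl = np.linspace(8e-6, 12e-6, 100)
emissivity = 0.95
radiance_300k = emissivity * planck_radiance(wl, 300.0)
radiance_320k = emissivity * planck_radiance(wl, 320.0)
# The hotter surface radiates more at every LWIR wavelength, which is what
# lets a hyperspectral sensor jointly estimate temperature and emissivity.
```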

Moscadelli, Matteo — University of Pisa


Galileo Broadcast Ephemeris and Clock Errors, and Observed Fault Probabilities for ARAIM

The characterization of clock and ephemeris errors of Global Navigation Satellite Systems is a key element in validating the assumptions behind the integrity analysis of GNSS Safety of Life (SoL) applications. Specifically, the performance metrics of SoL applications require the characterization of the nominal User Range Errors (UREs) as well as knowledge of the probability of a satellite fault, Psat, or a constellation fault, Pconst, i.e. of one or more satellites not being in the nominal mode. We focus on Advanced Receiver Autonomous Integrity Monitoring (ARAIM). The present dissertation carries out an end-to-end characterization and analysis of Galileo and GPS satellites for ARAIM. It involves two main targets. First, the characterization of Galileo and GPS broadcast ephemeris and clock errors, to determine the fault probabilities Psat and Pconst, and the determination of an upper bound on the nominal satellite ranging ...
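For intuition only, one common way to turn an observed fault history into a per-satellite state probability is to divide the accumulated fault time by the accumulated satellite time. Whether the dissertation uses this exact estimator is not stated in the excerpt, and all numbers below are invented:

```python
# Hedged sketch: Psat as (fault count x mean fault duration) divided by
# (number of satellites x observation time).
def estimate_psat(n_faults: int,
                  mean_fault_duration_h: float,
                  n_satellites: int,
                  observation_time_h: float) -> float:
    """Probability that a given satellite is faulted at a random epoch."""
    total_fault_time = n_faults * mean_fault_duration_h
    total_sat_time = n_satellites * observation_time_h
    return total_fault_time / total_sat_time

# Illustrative numbers only: 3 faults of ~1 h each, 24 satellites, 2 years.
psat = estimate_psat(3, 1.0, 24, 2 * 365 * 24)
print(f"Psat ~ {psat:.1e}")   # ~7e-6 in this toy example
```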

Alonso Alonso, María Teresa — Universitat Politècnica de Catalunya, BarcelonaTech


Monitoring Infants by Automatic Video Processing

The objective of this work is the development of non-invasive and low-cost systems for monitoring and automatically diagnosing specific neonatal diseases by means of the analysis of suitable video signals. We focus on monitoring infants potentially at risk of diseases characterized by the presence or absence of rhythmic movements of one or more body parts. Seizures and respiratory diseases are specifically considered, but the approach is general. Seizures are defined as sudden neurological and behavioural alterations. They are age-dependent phenomena and the most common sign of central nervous system dysfunction. Neonatal seizures have their onset within the 28th day of life in newborns at term and within the 44th week of conceptional age in preterm infants. Their main causes are hypoxic-ischaemic encephalopathy, intracranial haemorrhage, and sepsis. Studies indicate an incidence rate of neonatal seizures of 2‰ of live births, 11‰ for preterm ...
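A minimal sketch of how rhythmic-movement monitoring from video can work in principle (an illustration, not the system developed in the thesis; the frame rate and frequency band are illustrative): reduce the video to a one-dimensional motion signal by frame differencing, then look for a dominant periodicity:

```python
# Frame differencing -> 1D motion signal -> dominant-frequency check.
import numpy as np

def motion_signal(frames):
    """frames: array (T, H, W) of grayscale frames -> 1D motion energy."""
    diffs = np.abs(np.diff(frames.astype(float), axis=0))
    return diffs.mean(axis=(1, 2))

def dominant_frequency(signal, fps):
    """Strongest frequency [Hz] of a detrended 1D motion signal."""
    x = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    return freqs[spectrum[1:].argmax() + 1]   # skip the DC bin

# Toy check with a synthetic motion signal oscillating at 2 Hz, 25 fps;
# on real data one would call dominant_frequency(motion_signal(frames), fps).
fps = 25
t = np.arange(250) / fps
sig = 1.0 + 0.5 * np.sin(2 * np.pi * 2.0 * t)
print(dominant_frequency(sig, fps))   # -> 2.0
# A persistent peak inside a band such as 0.5-3 Hz could then flag
# clonic-like rhythmic movements (band limits illustrative, not clinical).
```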

Cattani, Luca — University of Parma (Italy)


Novel Methods in H.264/AVC (Inter Prediction, Data Hiding, Bit Rate Transcoding)

H.264 Advanced Video Coding became the dominant video coding standard in the market within a few years after the first version of the standard was completed by the ISO/IEC MPEG and ITU-T VCEG groups in May 2003. That happened mainly due to the great coding efficiency of H.264. Compared to MPEG-2, the previously dominant standard, the H.264 compression ratio is about twice as high for the same video quality. That makes H.264 ideal for numerous applications, such as video broadcasting, video streaming and video conferencing. However, the efficiency of H.264 is achieved at the expense of codec complexity: H.264 complexity is about four times that of MPEG-2. As a consequence, many video coding issues which had been addressed in previous standards need to be reconsidered. For example, the H.264 encoding of a video in real time ...

Kapotas, Spyridon — Hellenic Open University


Nonlinear processing of non-Gaussian stochastic and chaotic deterministic time series

It is often assumed that interference or noise signals are Gaussian stochastic processes. Gaussian noise models are appealing as they usually result in noise suppression algorithms that are simple, i.e. linear and closed-form. However, such linear techniques may be sub-optimal when the noise process is either a non-Gaussian stochastic process or a chaotic deterministic process. When such noise processes are encountered, improvements in noise suppression, relative to the performance of linear methods, may be achievable using nonlinear signal processing techniques. The application of interest for this thesis is maritime surveillance radar, where the main source of interference, termed sea clutter, is widely accepted to be a non-Gaussian stochastic process at high resolutions and/or at low grazing angles. However, evidence has been presented during the last decade which suggests that sea clutter may be better modelled as a ...
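For concreteness, a standard non-Gaussian sea-clutter model from the radar literature is the K-distribution, generated as a compound-Gaussian process. The sketch below is illustrative background, not the thesis's own model, and the shape parameter is arbitrary:

```python
# K-distributed clutter: complex Gaussian "speckle" modulated by a
# gamma-distributed "texture" (compound-Gaussian construction).
import numpy as np

rng = np.random.default_rng(1)

def k_clutter(n, shape_nu=0.5):
    """Complex K-distributed clutter samples; small nu = heavier tails."""
    texture = rng.gamma(shape_nu, 1.0 / shape_nu, n)     # mean-1 gamma
    speckle = (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)
    return np.sqrt(texture) * speckle

x = k_clutter(100_000, shape_nu=0.5)
amp = np.abs(x)
# Heavier-than-Rayleigh tails: large spikes occur far more often than a
# Gaussian (Rayleigh-amplitude) model predicts, which is why linear,
# Gaussian-optimal detectors lose performance on such clutter.
print(amp.mean(), (amp > 4 * amp.mean()).mean())
```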

Cowper, Mark — University Of Edinburgh


Progressive visualization of incomplete sonar-data sets: from sea-bottom interpolation and segmentation to geometry extraction

This thesis describes a visualization pipeline for sonar profiling data that show reflections of multiple sediments in the sea bottom and that cover huge survey areas with many gaps. Visualizing such data is not trivial, because they may be noisy and because the data sets may be very large. The developed techniques are: (1) Quadtree interpolation for estimating new sediment reflections at all gaps in the longitude-latitude plane. The quadtree guides the 3D interpolation process: gaps become small at low spatial resolutions, where they can be filled by interpolating between available reflections. In the interpolation, the reflection data are cross-correlated in order to establish continuity of multiple, sloping reflections. (2) Segmentation and boundary refinement in an octree in order to detect sediments in the sonar data. In the refinement, coarse boundaries are reclassified by filtering the data ...
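The coarse-to-fine idea behind technique (1) can be sketched as follows (a simplification for intuition: plain block averaging on a power-of-two grid, without the cross-correlation step the thesis uses to track sloping reflections):

```python
# Coarse-to-fine gap filling: downsample until gaps close, then propagate
# the coarse estimates back down to fill the fine-level gaps.
import numpy as np

def fill_gaps_coarse_to_fine(grid):
    """grid: 2D array with np.nan at gaps; square, power-of-two size."""
    if not np.isnan(grid).any():
        return grid
    if grid.shape[0] == 1:                      # coarsest level, still empty
        return np.nan_to_num(grid, nan=0.0)     # no data anywhere: fall back
    h, w = grid.shape
    blocks = grid.reshape(h // 2, 2, w // 2, 2)
    cnt = (~np.isnan(blocks)).sum(axis=(1, 3))  # valid samples per 2x2 block
    sums = np.nansum(blocks, axis=(1, 3))
    coarse = np.where(cnt > 0, sums / np.maximum(cnt, 1), np.nan)
    coarse = fill_gaps_coarse_to_fine(coarse)   # recurse until gaps vanish
    # Fill the remaining gaps at this level from the coarser estimate.
    fill = np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)
    return np.where(np.isnan(grid), fill, grid)

demo = np.arange(64, dtype=float).reshape(8, 8)
demo[2:4, 2:6] = np.nan                         # a gap in the survey
filled = fill_gaps_coarse_to_fine(demo)
```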

Loke, Robert Edward — Delft University of Technology


Interpretable Machine Learning for Machine Listening

Recent years have witnessed a significant interest in interpretable machine learning (IML) research that develops techniques to analyse machine learning (ML) models. Understanding ML models is essential to gain trust in their predictions and to improve datasets, model architectures and training techniques. The majority of effort in IML research has been in analysing models that classify images or structured data and comparatively less work exists that analyses models for other domains. This research focuses on developing novel IML methods and on extending existing methods to understand machine listening models that analyse audio. In particular, this thesis reports the results of three studies that apply three different IML methods to analyse five singing voice detection (SVD) models that predict singing voice activity in musical audio excerpts. The first study introduces SoundLIME (SLIME), a method to generate temporal, spectral or time-frequency explanations ...
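For readers unfamiliar with LIME-style explanations, the temporal case can be sketched generically as follows (a simplification, not SLIME itself; the segment count, sample count and the "predict" callable are illustrative): mute random subsets of temporal segments, query the model, and fit a linear surrogate whose weights rank segment importance:

```python
# Generic LIME-style temporal explanation for an audio classifier.
import numpy as np

rng = np.random.default_rng(0)

def temporal_explanation(audio, predict, n_segments=10, n_samples=500):
    """predict: callable mapping a waveform to P(singing voice)."""
    segs = np.array_split(np.arange(len(audio)), n_segments)
    masks = rng.integers(0, 2, (n_samples, n_segments))   # 1 = keep segment
    ys = np.empty(n_samples)
    for i, m in enumerate(masks):
        x = audio.copy()
        for s, keep in zip(segs, m):
            if not keep:
                x[s] = 0.0                                # mute the segment
        ys[i] = predict(x)
    # Linear surrogate (with intercept): weight per segment ~ importance.
    X = np.hstack([masks.astype(float), np.ones((n_samples, 1))])
    w, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return w[:-1]

# Usage idea: w = temporal_explanation(excerpt, svd_model_probability);
# large positive weights mark the segments the detector relies on most.
```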

Mishra, Saumitra — Queen Mary University of London


Three dimensional shape modeling: segmentation, reconstruction and registration

Accounting for uncertainty in three-dimensional (3D) shapes is important in a large number of scientific and engineering areas, such as biometrics, biomedical imaging, and data mining. It is well known that 3D polar shaped objects can be represented by Fourier descriptors such as spherical harmonics and double Fourier series. However, the statistics of these spectral shape models have not been widely explored. This thesis studies several areas involved in 3D shape modeling, including random field models for statistical shape modeling, optimal shape filtering, parametric active contours for object segmentation and surface reconstruction. It also investigates multi-modal image registration with respect to tumor activity quantification. Spherical harmonic expansions over the unit sphere not only provide a low dimensional polarimetric parameterization of stochastic shape, but also correspond to the Karhunen-Loève (K-L) expansion of any isotropic random field on the unit sphere. Spherical ...
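The expansion referenced above, written out in its standard textbook form (this restates known theory, not a result specific to the thesis):

```latex
% A square-integrable radius function on the unit sphere in the
% spherical-harmonic basis, with coefficients given by projection.
\[
  r(\theta,\phi) \;=\; \sum_{l=0}^{\infty} \sum_{m=-l}^{l}
      c_{lm}\, Y_{lm}(\theta,\phi),
  \qquad
  c_{lm} \;=\; \int_{S^2} r(\theta,\phi)\, Y_{lm}^{*}(\theta,\phi)\, \mathrm{d}\Omega .
\]
% For an isotropic random field the coefficients c_{lm} are uncorrelated,
% with variances depending only on the degree l, which is why this
% expansion coincides with the Karhunen-Loève (K-L) expansion.
```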

Li, Jia — University of Michigan


Advanced Signal Processing Concepts for Multi-Dimensional Communication Systems

The widespread use of mobile internet and smart applications has led to an explosive growth in mobile data traffic. With the rise of smart homes, smart buildings, and smart cities, this demand is ever growing, since future communication systems will require the integration of multiple networks serving diverse sectors, domains and applications, such as multimedia, virtual or augmented reality, machine-to-machine (M2M) communication / the Internet of Things (IoT), automotive applications, and many more. Therefore, future communication systems will be required not only to provide Gbps wireless connectivity but also to fulfill other requirements such as low latency and massive machine-type connectivity while ensuring quality of service. Without significant technological advances to increase system capacity, the existing telecommunications infrastructure will be unable to support these multi-dimensional requirements. This poses an important demand for suitable waveforms with ...

Cheema, Sher Ali — Technische Universität Ilmenau


Non-rigid Registration-based Data-driven 3D Facial Action Unit Detection

Automated analysis of facial expressions has been an active area of study due to its potential applications not only in intelligent human-computer interfaces but also in human facial behavior research. To advance automatic expression analysis, this thesis proposes and empirically validates two hypotheses: (i) 3D face data is a better data modality than conventional 2D camera images, not only because it is much less disturbed by illumination and head pose effects but also because it captures true facial surface information. (ii) It is possible to perform detailed face registration without resorting to any face modeling. This means that data-driven methods in automatic expression analysis can compensate for confounding effects such as pose and physiognomy differences, and can process facial features more effectively, without suffering the drawbacks of model-driven analysis. Our study is based upon the Facial Action Coding System (FACS), as this paradigm ...

Savran, Arman — Bogazici University


Development of an automated neonatal EEG seizure monitor

Brain function requires a continuous flow of oxygen and glucose. An insufficient supply for a few minutes during the first period of life may have severe consequences or even result in death. This happens in one to six infants per 1000 live term births. There is therefore a strong need for a method that enables bedside brain monitoring to identify the neonates at risk, so that treatment can be started in time. The most important currently available technology for continuously monitoring brain function is electroencephalography (EEG). Unfortunately, visual EEG analysis requires particular skills which are not always available around the clock in the Neonatal Intensive Care Unit (NICU). Even when those skills are available, it is laborious to manually analyse many hours of EEG. The lack of time and skill are the main reasons why EEG is ...

Deburchgraeve, Wouter — KU Leuven


Statistical methods using hydrodynamic simulations of stellar atmospheres for detecting exoplanets in radial velocity data

When the noise affecting time series is colored with unknown statistics, a difficulty for periodic signal detection is to control the true significance level at which the detection tests are conducted. This thesis investigates the possibility of using training datasets of the noise to improve this control. Specifically, for the case of regularly sampled observations, we analyze the performances of various detectors applied to periodograms standardized using the noise training datasets. Emphasis is put on sparse detection in the Fourier domain and on the limitation posed by the necessary finite size of the training sets available in practice. We study the resulting false alarm and detection rates and show that the proposed standardization leads, in some cases, to powerful constant false alarm rate tests. Although analytical results are derived in an asymptotic regime, numerical results show that the theory accurately ...
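A minimal sketch of the standardization idea, assuming it works roughly as follows (the thesis's estimators and tests are richer; the AR(1) noise, series length and injected signal are illustrative): divide the data periodogram pointwise by the average periodogram of the noise training series, so that a single threshold on the result has (asymptotically) the same false-alarm rate whatever the noise colour:

```python
# Periodogram standardization with a noise training set.
import numpy as np

rng = np.random.default_rng(2)

def periodogram(x):
    return np.abs(np.fft.rfft(x - x.mean()))**2 / len(x)

n, n_train = 512, 100

def colored_noise():
    """AR(1)-filtered white noise: a stand-in for colored stellar noise."""
    w = rng.normal(size=n)
    x = np.empty(n)
    x[0] = w[0]
    for t in range(1, n):
        x[t] = 0.8 * x[t - 1] + w[t]
    return x

# Average periodogram of the training series estimates the noise spectrum.
p_train = np.mean([periodogram(colored_noise()) for _ in range(n_train)],
                  axis=0)

# Data = noise + a weak sinusoid at Fourier bin 40.
data = colored_noise() + 0.8 * np.sin(2 * np.pi * 40 * np.arange(n) / n)
p_std = periodogram(data) / p_train        # standardized periodogram
print(p_std.argmax())                      # candidate frequency bin (~40)
# Under H0 the standardized ordinates are (asymptotically) identically
# distributed, so one threshold yields a constant false-alarm rate.
```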

Sulis, Sophia — Université Côte d’Azur


Change Detection Techniques for GNSS Signal-Level Integrity

The provision of accurate positioning is becoming essential to our modern society. One of the main reasons is the great success and ease of use of Global Navigation Satellite Systems (GNSSs), which has led to an unprecedented number of GNSS-based applications. In particular, the current trend shows that a new era of GNSS-based applications and services is emerging: the so-called critical applications, in which the physical safety of users may be endangered by a malfunction of the positioning system. These applications have very stringent requirements in terms of integrity. Integrity is a measure of the reliability of, and the trust that can be placed in, the information provided by the system. Integrity algorithms were originally designed for civil aviation in the 1980s. Unfortunately, GNSS-based critical applications are often associated with terrestrial environments, where the original integrity algorithms usually fail. ...
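As background on the title's "change detection", the classical quickest-change detector is Page's CUSUM; the sketch below shows its standard Gaussian mean-shift form (whether the thesis uses exactly this statistic cannot be read from the excerpt; all parameters are illustrative):

```python
# Page's CUSUM: sequential detection of a mean shift mu0 -> mu1.
import numpy as np

def cusum(x, mu0=0.0, mu1=1.0, sigma=1.0, threshold=5.0):
    """Return the first alarm index, or None if no change is declared.

    Accumulates the per-sample Gaussian log-likelihood ratio,
    clipped at zero (Page's recursion).
    """
    llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
    g = 0.0
    for k, z in enumerate(llr):
        g = max(0.0, g + z)       # restart whenever evidence drops to 0
        if g > threshold:
            return k              # alarm: a change (fault) is declared
    return None

rng = np.random.default_rng(3)
clean = rng.normal(0.0, 1.0, 300)
faulty = rng.normal(1.0, 1.0, 100)      # e.g. a ranging bias appearing
alarm = cusum(np.concatenate([clean, faulty]))
# The alarm falls shortly after sample 300; the threshold trades detection
# delay against false-alarm rate, mirroring the integrity trade-off above.
```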

Egea-Roca, Daniel — Universitat Autònoma de Barcelona
