Efficient Perceptual Audio Coding Using Cosine and Sine Modulated Lapped Transforms

The increasing number of simultaneous input and output channels utilized in immersive audio configurations primarily in broadcasting applications has renewed industrial requirements for efficient audio coding schemes with low bit-rate and complexity. This thesis presents a comprehensive review and extension of conventional approaches for perceptual coding of arbitrary multichannel audio signals. Particular emphasis is given to use cases ranging from two-channel stereophonic to six-channel 5.1-surround setups with or without the application-specific constraint of low algorithmic coding latency. Conventional perceptual audio codecs share six common algorithmic components, all of which are examined extensively in this thesis. The first is a signal-adaptive filterbank, constructed using instances of the real-valued modified discrete cosine transform (MDCT), to obtain spectral representations of successive portions of the incoming discrete time signal. Within this MDCT spectral domain, various intra- and inter-channel optimizations, most of which are of ...

Helmrich, Christian R. — Friedrich-Alexander-Universität Erlangen-Nürnberg


Informed spatial filters for speech enhancement

In modern devices which provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications can be significantly degraded. Therefore, effective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...

Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg


Adaptive Calibration of Frequency Response Mismatches in Time-Interleaved Analog-to-Digital Converters

The performance of today's communication systems is highly dependent on the employed analog-to-digital converters (ADCs), and in order to provide more flexibility and precision for the emerging communication technologies, high-performance ADCs are required. In this regard, the time-interleaved operation of an array of ADCs (TI-ADC) can be a reasonable solution. A TI-ADC can increase its throughput by using M channel ADCs or subconverters in parallel and sampling the input signal in a time-interleaved manner. However, the performance of a TI-ADC badly suffers from the mismatches among the channel ADCs. The mismatches among channel ADCs distort the TI-ADC output spectrum by introducing spurious tones besides the actual signal components. This thesis deals with the adaptive background calibration of frequency-response mismatches in a TI-ADC. By modeling each channel ADC as a linear time-invariant system, we develop the continuous-time, discrete-time, and time-varying system ...

Saleem, Shahzad — Graz University of Technology


Speech dereverberation in noisy environments using time-frequency domain signal models

Reverberation is the sum of reflected sound waves and is present in any conventional room. Speech communication devices such as mobile phones in hands-free mode, tablets, smart TVs, teleconferencing systems, hearing aids, voice-controlled systems, etc. use one or more microphones to pick up the desired speech signals. When the microphones are not in the proximity of the desired source, strong reverberation and noise can degrade the signal quality at the microphones and can impair the intelligibility and the performance of automatic speech recognizers. Therefore, it is a highly demanded task to process the microphone signals such that reverberation and noise are reduced. The process of reducing or removing reverberation from recorded signals is called dereverberation. As dereverberation is usually a completely blind problem, where the only available information are the microphone signals, and as the acoustic scenario can be non-stationary, ...

Braun, Sebastian — Friedrich-Alexander Universität Erlangen-Nürnberg


Time frequency modelling

The overriding aim of this thesis is to investigate the benefits of focusing time-frequency analysis on particular regions of the time-frequency plane. The thesis examines aspects of such a regionalisation in the analysis of both deterministic signals and stochastic processes. The majority of deterministic energetic time-frequency representations are non-parametric, indicating the distribution of the energy of a signal in the time-frequency plane but providing no further information about the time-frequency structure. This thesis develops a semi-parametric time-frequency model to simultaneously describe the time-frequency energetic structure of a signal and provide an indication of its time-frequency complexity. The model aims to identify ‘time-frequency components’ within the signal to indicate how their energy is distributed in the time-frequency plane and thereby to probabilistically associate every location in the plane with each identified component. The thesis investigates a number of applications of the ...

Coates, Mark — University of Cambridge


OFDM Multi-User Communication Over Time-Variant Channels

Wireless broadband communications for users moving at vehicular speed is a cornerstone of future fourth generation (4G) mobile communication systems. We investigate a multi-carrier (MC) code division multiple access (CDMA) system which is based on orthogonal frequency division multiplexing (OFDM). A spreading sequence is used in the frequency domain in order to distinguish individual users and to take advantage of the multipath diversity of the wireless channel. The transmission is block oriented. A block consists of OFDM pilot and OFDM data symbols. At pedestrian velocities the channel can be modelled as block fading. We apply iterative multi-user detection and channel estimation. In iterative receivers soft symbols are derived from the output of a soft-input soft-output decoder. These soft symbols are used in order to reduce the interference from other users and to enhance the channel estimates. We ...

Zemen, T. — Vienna University of Technology


Sequential Bayesian Modeling of non-stationary signals

are involved until the development of Sequential Monte Carlo techniques which are also known as the particle filters. In particle filtering, the problem is expressed in terms of state-space equations where the linearity and Gaussianity requirements of the Kalman filtering are generalized. Therefore, we need information about the functional form of the state variations. In this thesis, we bring a general solution for the cases where these variations are unknown and the process distributions cannot be expressed by any closed form probability density function. Here, we propose a novel modeling scheme which is as unified as possible to cover all these problems. Therefore we study the performance analysis of our unifying particle filtering methodology on non-stationary Alpha Stable process modeling. It is well known that the probability density functions of these processes cannot be expressed in closed form, except for ...

Gencaga, Deniz — Bogazici University


Discrete Quadratic Time-Frequency Distributions: Definition, Computation, and a Newborn Electroencephalogram Application

Most signal processing methods were developed for continuous signals. Digital devices, such as the computer, process only discrete signals. This dissertation proposes new techniques to accurately define and efficiently implement an important signal processing method, the time-frequency distribution (TFD), using discrete signals. The TFD represents a signal in the joint time-frequency domain. Because these distributions are a function of both time and frequency they, unlike traditional signal processing methods, can display frequency content that changes over time. TFDs have been used successfully in many signal processing applications as almost all real-world signals have time-varying frequency content. Although TFDs are well defined for continuous signals, defining and computing a TFD for discrete signals is problematic. This work overcomes these problems by making contributions to the definition, computation, and application of discrete TFDs. The first contribution is a new discrete definition of TFDs. A ...

O'Toole, John M. — University of Queensland


Compressive Sensing of Cyclostationary Propeller Noise

This dissertation is the combination of three manuscripts (either published in or submitted to journals) on compressive sensing of propeller noise for detection, identification and localization of watercraft. Propeller noise, as a result of rotating blades, is broadband and radiates through water, dominating the underwater acoustic noise spectrum, especially when cavitation develops. Propeller cavitation yields cyclostationary noise which can be modeled by amplitude modulation, i.e., the envelope-carrier product. The envelope consists of the so-called propeller tonals, which represent propeller characteristics and are used to identify watercraft, whereas the carrier is a stationary broadband process. Sampling for propeller noise processing yields large data sizes due to the Nyquist rate and multiple sensor deployment. A compressive sensing scheme is proposed for efficient sampling of second-order cyclostationary propeller noise since the spectral correlation function of the amplitude modulation model is sparse as shown in ...

Fırat, Umut — Istanbul Technical University


Speech Enhancement Algorithms for Audiological Applications

The improvement of speech intelligibility is a traditional problem which still remains open and unsolved. The recent boom of applications such as hands-free communications or automatic speech recognition systems and the ever-increasing demands of the hearing-impaired community have given a definitive impulse to the research in this area. This PhD thesis is focused on speech enhancement for audiological applications. Most of the research conducted in this thesis has been focused on the improvement of speech intelligibility in hearing aids, considering the variety of restrictions and limitations imposed by this type of device. The combination of source separation techniques and spatial filtering with machine learning and evolutionary computation has originated novel and interesting algorithms which are included in this thesis. The thesis is divided into two main parts. The first one contains a preliminary study of the problem and a ...

Ayllón, David — Universidad de Alcalá


Sound Source Separation in Monaural Music Signals

Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or one-channel, music recordings. We concentrate on separation methods where the sources to be separated are not known beforehand. Instead, the separation is enabled by utilizing the common properties of real-world sound sources, which are their continuity, sparseness, and repetition in time and frequency, and their harmonic spectral structures. One of the separation approaches taken here uses unsupervised learning and the other uses model-based inference based on sinusoidal modeling. Most of the existing unsupervised separation algorithms are based on a linear instantaneous signal model, where each frame of the input mixture signal is modeled ...

Virtanen, Tuomas — Tampere University of Technology


Signal Separation

The problem of signal separation is a very broad and fundamental one. A powerful paradigm within which signal separation can be achieved is the assumption that the signals/sources are statistically independent of one another. This is known as Independent Component Analysis (ICA). In this thesis, the theoretical aspects and derivation of ICA are examined, from which disparate approaches to signal separation are drawn together in a unifying framework. This is followed by a review of signal separation techniques based on ICA. Second order statistics based output decorrelation methods are employed to try to solve the challenging problem of separating convolutively mixed signals, in the context of mainly audio source separation and the Cocktail Party Problem. Various optimisation techniques are devised to implement second order signal separation of both artificially mixed signals and real mixtures. A study of the advantages and ...

Ahmed, Alijah — University of Cambridge


Pre-processing of Speech Signals for Robust Parameter Estimation

The topic of this thesis is methods of pre-processing speech signals for robust estimation of model parameters in models of these signals. Here, there is a special focus on the situation where the desired signal is contaminated by colored noise. In order to estimate the speech signal, or its voiced and unvoiced components, from a noisy observation, it is important to have robust estimators that can handle colored and non-stationary noise. Two important aspects are investigated. The first one is a robust estimation of the speech signal parameters, such as the fundamental frequency, which is required in many contexts. For this purpose, fast estimation methods based on a simple white Gaussian noise (WGN) assumption are often used. To keep using those methods, the noisy signal can be pre-processed using a filter. If the colored noise is modelled as an autoregressive ...

Esquivel Jaramillo, Alfredo — Aalborg University


Transformation methods in signal processing

This dissertation is concerned with the application of the theory of rational functions in signal processing. The PhD thesis summarizes the corresponding results of the author’s research. Since the systems of rational functions are defined by the collection of inverse poles with multiplicities, the following parameters should be determined: the number, the positions and the multiplicities of the inverse poles. Therefore, we develop the hyperbolic variant of the so-called Nelder–Mead and the particle swarm optimization algorithm. In addition, the latter one is integrated into a more general multi-dimensional framework. Furthermore, we perform a detailed stability and error analysis of these methods. We propose an electrocardiogram signal generator based on spline interpolation. It turns out to be an efficient tool for testing and evaluating signal models, filtering techniques, etc. In this thesis, the synthesized heartbeats are used to test the diagnostic distortion ...

Kovács, Péter — Eötvös L. University, Budapest, Hungary


Ultra low-power biomedical signal processing: an analog wavelet filter approach for pacemakers

The purpose of this thesis is to describe novel signal processing methodologies and analog integrated circuit techniques for low-power biomedical systems. Physiological signals, such as the electrocardiogram (ECG), the electroencephalogram (EEG) and the electromyogram (EMG) are mostly non-stationary. The main difficulty in dealing with biomedical signal processing is that the information of interest is often a combination of features that are well localized temporally (e.g., spikes) and others that are more diffuse (e.g., small oscillations). This requires the use of analysis methods sufficiently versatile to handle events that can be at opposite extremes in terms of their time-frequency localization. Wavelet Transform (WT) has been extensively used in biomedical signal processing, mainly due to the versatility of the wavelet tools. The WT has been shown to be a very efficient tool for local analysis of nonstationary and fast transient signals due ...

Haddad, Sandro Augusto Pavlík — Delft University of Technology
