Audio Signal Processing for Binaural Reproduction with Improved Spatial Perception

Binaural technology aims to reproduce three-dimensional auditory scenes with a high level of realism by providing the auditory display with spatial hearing information. This technology has various applications in virtual acoustics, architectural acoustics, telecommunication and auditory science. One key element in binaural technology is the binaural signals themselves, produced by filtering a sound-field with free-field head-related transfer functions (HRTFs). With the increased popularity of spherical microphone arrays for sound-field recording, methods have been developed for rendering binaural signals from these recordings. The use of spherical arrays naturally leads to processing methods that are formulated in the spherical harmonics (SH) domain. For accurate SH representation, high-order functions, of both the sound-field and the HRTF, are required. However, a limited number of microphones, on the one hand, and challenges in acquiring high-resolution individual HRTFs, on the other hand, impose limitations on the measurement of these functions, which lead to order-limited binaural reproduction, also referred to as higher-order Ambisonics. The limited spatial resolution of the reproduced signal due to order-limited reproduction has previously been investigated perceptually, showing spatial perception ramifications such as poor source localization, limited externalization and distorted timbre. However, the underlying causes of these ramifications have not been studied comprehensively, and the effect of limited sampling of both the sound-field and the HRTF on the reproduced binaural signals has not been fully addressed. Furthermore, current solutions for low-order binaural reproduction are limited in their ability to provide high-fidelity spatial sound.
In view of the limitations of current methods and present understanding, this work aims to advance the state-of-the-art in two main directions: (i) to better understand the limitations of low-order binaural reproduction by investigating the errors due to both the order-limited sound-field and sparsely sampled HRTFs; and (ii) to improve the spatial perception of the reproduced binaural signals by developing methods that overcome the limitations of low-order reproduction. These aims are realized through the collection of papers that make up this thesis. First, conditions for the joint sampling of sound-fields and HRTFs for order-limited binaural reproduction are derived. These provide insight into the limitations of sampling both functions, and improve our understanding of the effect of spatial aliasing on the reproduced low-order signals. Then, the loss of energy at high frequencies that occurs when binaural signals are order-truncated in the SH domain is addressed, and a method is developed to recover the spectral energy at high frequencies, thus improving perception. Finally, the limitations due to sparse measurement grids of the HRTFs are studied by analyzing both the aliasing and the truncation errors caused by the limited-order SH representation. This analysis offers explanations for the perceptual ramifications of loudness instability. A new efficient representation of HRTFs, called ear-alignment, is then developed, which significantly reduces errors in the reconstruction of sparse HRTFs. The incorporation of the proposed efficient representation into binaural reproduction is then presented, yielding the Bilateral Ambisonics reproduction concept, which leads to substantial improvement in the perception of order-limited binaural signals.
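As a rough illustration of why order limitation matters, the sketch below uses the common rule of thumb that an SH representation of order N is accurate up to about kr = N, where k is the wavenumber and r is the radius of the region of interest. The abstract does not state this rule; it is assumed here, together with a head radius of 0.0875 m and a speed of sound of 343 m/s. The function names are hypothetical and chosen only for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed value, not from the abstract)
HEAD_RADIUS = 0.0875    # m (assumed typical head radius, not from the abstract)


def num_sh_coefficients(order: int) -> int:
    """Number of SH coefficients in an order-N representation: (N + 1)^2."""
    return (order + 1) ** 2


def max_frequency_hz(order: int,
                     radius: float = HEAD_RADIUS,
                     c: float = SPEED_OF_SOUND) -> float:
    """Approximate upper frequency for order N, from the rule of thumb kr = N."""
    return order * c / (2.0 * math.pi * radius)


def required_order(frequency_hz: float,
                   radius: float = HEAD_RADIUS,
                   c: float = SPEED_OF_SOUND) -> int:
    """Smallest integer order N satisfying N >= kr at the given frequency."""
    return math.ceil(2.0 * math.pi * frequency_hz * radius / c)


if __name__ == "__main__":
    for order in (1, 4, 8):
        print(f"N={order}: {num_sh_coefficients(order)} coefficients, "
              f"approx. valid up to {max_frequency_hz(order):.0f} Hz")
    full_band = required_order(20000.0)
    print(f"Order for 20 kHz: {full_band} "
          f"({num_sh_coefficients(full_band)} coefficients)")
```

Under these assumptions, low orders cover only the lower part of the audible band, while full-band coverage calls for far more coefficients than a typical microphone array or sparse HRTF measurement grid can supply. This is consistent with the truncation and aliasing errors discussed above, although the thesis itself should be consulted for the exact conditions it derives.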

File Type: pdf
File Size: 15 MB
Publication Year: 2020
Author: Ben-Hur, Zamir
Supervisors: Boaz Rafaely
Institution: Ben-Gurion University of the Negev
Keywords: 3D audio, spatial audio, spatial hearing, spatial sound perception, binaural reproduction, binaural signals, virtual sound, head-related transfer function, spherical harmonics, spherical microphone arrays, Ambisonics