Efficient parametric modeling, identification and equalization of room acoustics
Room acoustic signal enhancement (RASE) applications, such as digital equalization, acoustic echo and feedback cancellation, which are commonly found in communication devices and audio equipment, aim at processing the acoustic signals with the final goal of improving the perceived sound quality in rooms. In order to do so, signal processing algorithms require the acoustic response of the room to be represented by means of parametric models and to be identified from the input and output signals of the room acoustic system. In particular, a good model should be both accurate, thus capturing those features of room acoustics that are physically and perceptually most relevant, and efficient, so that it can be implemented as a digital filter and used in practical signal processing tasks. This thesis addresses the fundamental question in room acoustic signal processing concerning the appropriateness of different parametric models for room acoustics. Most room acoustic signal processing algorithms rely on the simplicity and versatility of all-zero (AZ) models, which however may require a large number of parameters to approximate a room impulse response (RIR) with high accuracy. The main goal of this thesis is then to develop parametric models with the same modeling accuracy as AZ models, but with lower model complexity. Pole-zero (PZ) models and especially models based on orthonormal basis functions (OBFs) are investigated. The properties of OBF models, such as orthogonality and scalability, are exploited in the development of iterative scalable algorithms, which provide numerically well-conditioned estimates of the model parameters. The nonlinear problem of estimating the pole parameters from measured RIRs is approached with a grid-search method, which not only provides stable and accurate estimates, but also enables an arbitrary allocation of the spectral resolution. 
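As a rough illustration of the grid-search idea described above, the following minimal sketch picks one complex-conjugate pole pair from a predefined grid by solving a linear least-squares problem for each candidate and keeping the one with the smallest residual. It is a simplified stand-in, not the thesis algorithm: the basis is a plain damped cosine/sine pair rather than an orthonormalized OBF section, and the function name, grid values and synthetic single-mode response are all assumptions made for the example.

```python
import numpy as np

def grid_search_pole(h, radii, angles):
    """Return (error, pole, coeffs) for the best candidate pole pair.

    Hypothetical helper: for every candidate pole r*exp(j*theta) on the grid,
    the target response h is fitted by least squares with the two real basis
    responses of that pole pair, and the lowest-error candidate is kept.
    """
    n = np.arange(len(h))
    best_err, best_pole, best_coeffs = np.inf, None, None
    for r in radii:
        for theta in angles:
            env = r ** n
            # Two real basis functions spanned by the pole pair (not orthonormalized).
            B = np.column_stack((env * np.cos(theta * n), env * np.sin(theta * n)))
            # The linear coefficients follow in closed form once the pole is fixed.
            coeffs, *_ = np.linalg.lstsq(B, h, rcond=None)
            err = np.sum((h - B @ coeffs) ** 2)
            if err < best_err:
                best_err, best_pole, best_coeffs = err, r * np.exp(1j * theta), coeffs
    return best_err, best_pole, best_coeffs

# Synthetic single-mode "RIR" (assumed values), with its pole lying on the grid.
n = np.arange(512)
true_r, true_theta = 0.98, 0.3
h = true_r ** n * np.cos(true_theta * n + 0.7)

radii = np.linspace(0.90, 0.99, 10)   # candidate pole radii, all inside the unit circle
angles = np.linspace(0.1, 1.0, 10)    # candidate pole angles in rad/sample
err, pole, coeffs = grid_search_pole(h, radii, angles)
```

Because every candidate radius is below one, the selected pole is stable by construction, and the grid density can be chosen freely per frequency region, which is what allows the spectral resolution to be allocated arbitrarily.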
A 50% reduction in the number of parameters compared to AZ models is achieved over the full band, and up to 75% at low and mid frequencies. A further reduction is obtained by estimating a set of poles common to multiple RIRs, based on the physically motivated assumption that the poles are independent of the loudspeaker and microphone positions. In many algorithms for RASE applications, the RIR has to be identified from speech or audio input-output signals, typically using adaptive digital filters. Fixed-pole infinite impulse response (IIR) adaptive filters based on OBF models, or simply OBF filters, present interesting properties in terms of error performance and convergence of the filter coefficients, which depend on the number and position of the fixed poles. A grid-search approach has also been adopted for the pole estimation in the multi-channel identification case, thus avoiding the use of recursive nonlinear algorithms. The resulting iterative algorithm adapts the linear coefficients of the multi-channel OBF filter using a modified version of the normalized least mean squares (NLMS) algorithm, meant to deal with issues at very low model orders, whereas the standard NLMS is used to track correlation parameters, based on which a new pair of complex-conjugate poles is fixed in the filter. A significant improvement in terms of identification accuracy and convergence compared to finite impulse response (FIR) filters, as well as robustness with respect to changes in the microphone positions, is observed at low frequencies, especially in small or damped rooms. The reduction in the filter order and the use of a common set of poles also help in addressing some of the issues encountered in RASE applications, such as echo path undermodeling in acoustic echo cancellation, or frequency allocation in inverse filtering for digital equalization.
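For reference, the standard NLMS update mentioned above can be sketched as follows for the FIR baseline against which the OBF filters are compared. This is the textbook NLMS recursion only; the modified NLMS and the multi-channel OBF filter structure of the thesis are not reproduced here, and the step size, regularization constant, filter order and synthetic response are assumptions chosen for the example.

```python
import numpy as np

def nlms_identify(x, d, order, mu=0.5, eps=1e-8):
    """Identify an FIR response from input x and output d with standard NLMS."""
    w = np.zeros(order)      # adaptive filter coefficients
    buf = np.zeros(order)    # most recent input samples, newest first
    for xn, dn in zip(x, d):
        buf = np.roll(buf, 1)
        buf[0] = xn
        e = dn - w @ buf                        # a priori estimation error
        w += mu * e * buf / (buf @ buf + eps)   # step normalized by input energy
    return w

rng = np.random.default_rng(0)
# Short decaying synthetic response standing in for a (very short) RIR.
h_true = rng.standard_normal(16) * np.exp(-0.3 * np.arange(16))
x = rng.standard_normal(5000)                # white-noise excitation
d = np.convolve(x, h_true)[: len(x)]         # noise-free room output
w = nlms_identify(x, d, order=16)
```

In this noise-free, white-input setting the coefficients converge to the true response. With an FIR filter the order must cover the full length of the response, which is the complexity burden that a fixed-pole structure is meant to reduce.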
Particular attention is paid to the low-frequency region of modal resonances, where the acoustics of small rooms is typically more problematic. In this regard, a series of acoustic measurements has been performed in a rectangular room using a subwoofer as the sound source. The issues of measuring RIRs at low frequencies, mostly related to high ambient noise and to the nonlinear distortions produced by the subwoofer, are addressed and partially solved by means of the exponential sine-sweep method, a careful calibration of the measuring equipment and post-processing operations. Moreover, a novel procedure for estimating the frequency-dependent reverberation time is suggested. Finally, two applications in the context of digital equalization are presented. The first introduces a design procedure for a low-order equalizer using parametric IIR filters with improved mathematical tractability of the equalization problem and other desirable properties, which is used for minimum-phase equalization of loudspeaker and room responses. The second application describes the implementation of an existing solution for nonminimum-phase multi-channel equalization of car cabin acoustics, which involves the modeling of different aspects of the acoustic transfer functions. The common-poles version of a modeling algorithm for PZ models is derived and adapted for estimating excess-phase zeros, which are then used to compensate for nonminimum-phase distortions.
