Acoustic echo reduction for multiple loudspeakers and microphones: Complexity reduction and convergence enhancement

Modern devices such as mobile phones, tablets or smart speakers are commonly equipped with several loudspeakers and microphones. When such a device is used for hands-free communication, for instance, the signals reproduced by the loudspeakers propagate through the room and are inevitably acquired by the microphones. If no processing is applied, the participants in the far-end room receive delayed, reverberated replicas of their own voices, which strongly degrades both speech intelligibility and user comfort. To prevent these so-called acoustic echoes from being transmitted back to the far-end room, acoustic echo cancelers are commonly employed. These use adaptive filtering techniques to identify the propagation paths between loudspeakers and microphones. The estimated propagation paths are then employed to compute acoustic echo estimates, which are finally subtracted from the signals acquired by the microphones. In doing so, acoustic echoes can be effectively reduced before transmission. The problem of reducing echoes caused by the acoustic coupling between loudspeakers and microphones has been addressed repeatedly over the past four decades. However, there are still open questions and, therefore, research opportunities in the field of acoustic echo reduction. Much of the work carried out nowadays is related to the complexity reduction and convergence enhancement of existing adaptive algorithms for acoustic echo cancellation. In addition, mechanisms to overcome or mitigate the performance limitations of existing adaptive algorithms are still being developed. These include residual echo estimators and suppressors. The reduction of the computational cost of acoustic echo cancellation becomes essential when the communication device comprises multiple loudspeakers or microphones. 
Since sub-band-domain adaptive filters are computationally less expensive than their time-domain counterparts, a straightforward solution to reduce the computational load of acoustic echo cancellation is to implement it in a sub-band domain. Moreover, if necessary, the computational complexity of sub-band-domain adaptive filters can be reduced even further by neglecting possible dependencies between sub-bands. However, this simplification leads to performance limitations and must be analyzed and understood before its consequences can be alleviated. Acoustic echo cancellation for multiple loudspeakers and/or microphones presents additional challenges. On the one hand, given a multi-loudspeaker setup, the convergence rate of multi-channel acoustic echo cancellation is severely degraded if the signals reproduced by the loudspeakers are highly correlated. To overcome this performance deficiency, coherence reduction methods are commonly used to decorrelate the far-end signals before reproduction. However, this may degrade the quality of the signals reproduced by the loudspeakers, and a compromise between convergence enhancement and perceptual audio quality degradation has to be made. On the other hand, existing solutions for the combination of acoustic echo cancellation with multi-microphone noise reduction techniques fail to deliver satisfactory performance, as they either exhibit convergence deficiencies or imply a high computational cost. The focus of this thesis lies on the complexity reduction and convergence enhancement of acoustic echo cancellation for acoustic setups with either multiple loudspeakers or multiple microphones. Additionally, we provide mechanisms to estimate and reduce residual echoes that may remain after cancellation. First, acoustic echo cancellation employing discrete Fourier transform-based sub-band-domain adaptive filters is described. 
This allows us to identify the sub-band dependencies and analyze the consequences of neglecting them. Based on these analyses, we propose novel methods for both the complexity reduction and convergence enhancement of sub-band-domain acoustic echo cancellation. The proposed solutions are derived for single-loudspeaker single-microphone acoustic setups, but can be straightforwardly extended to more complex scenarios. Subsequently, the problem of reducing acoustic echoes given a multi-channel loudspeaker setup is studied, and an overview of existing solutions to enhance the convergence of multi-channel acoustic echo cancellation is provided. Among the existing coherence reduction methods, we focus on linear periodically time-varying approaches. We develop a theoretical framework to analyze their coherence reduction capability and propose solutions to improve their trade-off between convergence enhancement and subjective audio quality degradation. Since residual echoes commonly remain after cancellation, a mechanism is proposed to accurately compute multi-channel residual echo estimates regardless of the relation between the far-end signals. The obtained residual echo estimates are then employed to compute the gains of a residual echo suppression post-processor. Finally, a low-complexity method for multi-microphone acoustic echo cancellation is introduced, which computes the relation of the acoustic echoes across microphones instead of the acoustic echo propagation paths. In doing so, the length of the adaptive filters can be reduced without severely compromising the performance of multi-microphone acoustic echo cancellation. To provide a complete solution for the reduction of acoustic echoes given a multi-microphone setup, we propose to employ multi-microphone speech enhancement techniques to reduce residual echoes that remain after cancellation. 
Solutions for the estimation of multi-microphone residual echoes are proposed as well, including low-complexity alternatives thereof.
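
As a rough illustration of the basic principle summarized above (identify the echo path adaptively, compute an echo estimate, subtract it from the microphone signal), the following is a minimal single-loudspeaker single-microphone sketch. It uses a time-domain normalized least-mean-squares (NLMS) filter, which is a textbook choice assumed here for brevity and not the sub-band-domain method of the thesis. The function name, the parameter values and the synthetic echo path are all hypothetical.

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, filter_len=64, step=0.5, eps=1e-8):
    """Time-domain NLMS sketch: returns the echo-reduced signal and the filter estimate."""
    w = np.zeros(filter_len)       # estimate of the echo propagation path
    x_buf = np.zeros(filter_len)   # most recent far-end samples, newest first
    err = np.zeros(len(mic))
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_est = w @ x_buf                  # acoustic echo estimate
        err[n] = mic[n] - echo_est            # subtract it from the microphone signal
        # Normalized gradient step; eps avoids division by zero.
        w += step * err[n] * x_buf / (x_buf @ x_buf + eps)
    return err, w

# Synthetic test case (assumption): white-noise far-end signal, a decaying
# random echo path of 32 taps, no near-end speech and no background noise.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(8000)
true_path = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))
mic = np.convolve(far_end, true_path)[:len(far_end)]

err, w_hat = nlms_echo_canceller(far_end, mic)
```

In this idealized setting the filter is longer than the echo path and the far-end signal is white, so the residual echo in `err` decays quickly. With correlated far-end signals, several loudspeakers or filters shorter than the true path, convergence is slower or the cancellation is incomplete, which are the limitations the abstract refers to.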

File Type: pdf
File Size: 12 MB
Publication Year: 2019
Author: Luis Valero, Maria
Supervisors: Emanuel Habets
Institution: International Audio Laboratories Erlangen
Keywords: Acoustic echo cancellation, Acoustic echo suppression, Frequency-domain adaptive filtering, Sub-band-domain adaptive filtering, Relative transfer function estimation