Signal-Dependent Spatial Filtering Based on Weighted-Orthogonal Beamformers in the Spherical Harmonic Domain
Spatial filtering with microphone arrays is a technique that can be utilized to obtain the signal of a target sound source from a specific direction. Typical approaches in the field of audio underperform in practical environments with multiple sound sources and diffuse sound. In this contribution, we propose a postfiltering technique to suppress the effect of interferers and diffuse sound. The proposed technique utilizes the cross-spectral estimates of the outputs of two beamformers to formulate a time-frequency soft masker. The beamformers' outputs are used only for parameter estimation and not for generating an audio signal. Two sets of beamformer weights, one constant and one adaptive, are applied to the microphone array signals for the parameter estimation. The weights of the constant beamformer are designed such that they provide a spatially narrow beam pattern that is time and frequency invariant, with unity gain toward the direction of interest. The weights of the adaptive beamformer are formulated using linearly constrained optimization, with two constraints: weighted orthogonality with respect to the constant beamformer weights, and unity gain toward the look direction. The orthogonality constraint provides diffuse sound suppression, while the unity gain constraint ensures a distortionless response. The cross spectrum of these two beamformers provides the target energy at a given look direction for the postfilter. The study focuses on compact microphone arrays, for which typical beamforming techniques feature a tradeoff between noise amplification and spatial selectivity, especially in the low-frequency region. The proposed method is evaluated with instrumental measures and listening tests under different reverberation times, in dual-talker and multitalker scenarios.
The evaluation shows that the proposed method provides a better performance when compared with a previous state-of-the-art spatial filter based on cross-pattern coherence, a linearly constrained beamformer and a Wiener postfilter.
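The two steps outlined in the abstract, a linearly constrained adaptive beamformer and a cross-spectrum-based soft mask, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a standard minimum-variance cost `w^H R w` for the "linearly constrained optimization", a hypothetical Hermitian weighting matrix `A` for the weighted orthogonality, and a mask formed as the ratio of the estimated target energy to a reference energy. The function names, the choice of cost function, and the mask normalization are all assumptions made for illustration.

```python
import numpy as np

def adaptive_weights(R, d, w_c, A):
    """Minimise w^H R w subject to w^H d = 1 and w^H A w_c = 0.

    R   : (M, M) Hermitian positive-definite covariance (assumed cost matrix)
    d   : (M,) steering vector toward the look direction
    w_c : (M,) constant beamformer weights
    A   : (M, M) weighting matrix for the orthogonality constraint (hypothetical)
    """
    # Constraint matrix C and response g, so that C^H w = g.
    C = np.column_stack([d, A @ w_c])
    g = np.array([1.0, 0.0], dtype=complex)
    Ri_C = np.linalg.solve(R, C)  # R^{-1} C
    # Closed-form solution of the linearly constrained problem.
    return Ri_C @ np.linalg.solve(C.conj().T @ Ri_C, g)

def soft_mask(y_c, y_a, ref_energy, eps=1e-12):
    """Soft mask per frequency bin from the beamformers' cross spectrum.

    y_c, y_a   : (F, T) complex beamformer outputs (bins x time frames)
    ref_energy : (F,) energy of the signal to be masked (assumed normalization)
    """
    # Time-averaged cross spectrum as the target-energy estimate.
    cross = np.mean(y_c * np.conj(y_a), axis=-1)
    target = np.maximum(np.real(cross), 0.0)
    return np.clip(target / np.maximum(ref_energy, eps), 0.0, 1.0)
```

Note that the two constraints are only jointly feasible when `d` and `A @ w_c` are linearly independent; with an identity weighting and constant weights proportional to the steering vector, they would contradict each other.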
|Publication||IEEE/ACM Transactions on Audio, Speech, and Language Processing|
|Status||Published - 1 September 2016|
|OKM publication type||A1 Journal article, refereed|