An MVDR-Embedded U-Net Beamformer for Effective and Robust Multichannel Speech Enhancement

Abstract

In multichannel speech enhancement (SE) systems based on beamforming, deep neural networks (DNNs) are often used to estimate the beamformer weights directly. This approach, however, may not generalize well to new acoustic conditions. Alternatively, DNNs can predict time-frequency (T-F) masks for speech and noise, which are then used with a statistical beamformer. This approach is robust, but its performance is limited by the latter component's reliance on certain modeling assumptions, e.g., covariance-based modeling in the minimum-variance distortionless-response (MVDR) beamformer. In this paper, we propose a novel integration of the two approaches: an intra-MVDR module embedded in the U-Net architecture that combines the merits of both, i.e., effectiveness and robustness. Simulation results show that the proposed MVDR-embedded U-Net leads to SE improvements that are not achievable by simply enlarging the network with baseline approaches.
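For readers unfamiliar with the mask-based statistical beamforming mentioned in the abstract, the following is a minimal NumPy sketch of a conventional mask-driven MVDR beamformer. It is not the intra-MVDR module proposed in the paper. The function name `mask_based_mvdr`, the array layout, and the choice of the reference-channel (Souden-style) MVDR solution are assumptions made for illustration only; the masks would normally come from a DNN but are simply passed in here.

```python
import numpy as np

def mask_based_mvdr(Y, speech_mask, noise_mask, ref_mic=0, eps=1e-8):
    """Illustrative mask-based MVDR (hypothetical helper, not the paper's code).

    Y:            complex STFT of the microphone signals, shape (C, F, T)
    speech_mask:  real-valued T-F mask for speech, shape (F, T)
    noise_mask:   real-valued T-F mask for noise, shape (F, T)
    Returns the enhanced single-channel STFT, shape (F, T).
    """
    C = Y.shape[0]

    def covariance(mask):
        # Mask-weighted spatial covariance per frequency bin, shape (F, C, C):
        # Phi(f) = sum_t m(f, t) y(f, t) y(f, t)^H / sum_t m(f, t)
        num = np.einsum("ft,cft,dft->fcd", mask, Y, Y.conj())
        return num / (mask.sum(axis=1)[:, None, None] + eps)

    phi_s = covariance(speech_mask)
    # Small diagonal loading keeps the noise covariance invertible.
    phi_n = covariance(noise_mask) + eps * np.eye(C)

    # Reference-channel MVDR solution (assumed formulation):
    # w(f) = Phi_n^{-1} Phi_s u / trace(Phi_n^{-1} Phi_s), u one-hot at ref_mic
    numerator = np.linalg.solve(phi_n, phi_s)            # (F, C, C)
    trace = np.trace(numerator, axis1=1, axis2=2)[:, None]
    w = numerator[:, :, ref_mic] / (trace + eps)         # (F, C)

    # Apply the beamformer: x_hat(f, t) = w(f)^H y(f, t)
    return np.einsum("fc,cft->ft", w.conj(), Y)
```

In this sketch the beamformer depends on the network only through the two covariance estimates, which illustrates the covariance-based modeling assumption that the abstract identifies as the limiting factor of this family of methods.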

Authors: Ching-Hua Lee, Kashyap Patel, Chouchang Yang, Yilin Shen, Hongxia Jin

Published: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date: Apr 14, 2024
