Authors : Doyeon Kim, Hyewon Han, Hyeon-Kyeong Shin, Soo-Whan Chung, Hong-Goo Kang
Year : 2022
Publisher / Conference : ICASSP
Research area : Speech Signal Processing, Speech Enhancement
Related project : Development of far-end signal preprocessing and keyword recognition algorithms for improving speech recognition performance
Presentation : Poster
In this paper, we propose an effective phase reconstruction strategy for speech enhancement in noisy and reverberant environments. In neural network-based speech enhancement systems, various forms of phase information have been explicitly or implicitly included in the training loss. However, the resulting improvement in enhanced speech quality has not been significant, and there has been no clear analysis of the relationship between the type of phase loss and enhanced speech quality. We propose a novel strategy that uses a phase continuity loss, which focuses on relative phase estimation. Through various experiments measuring the effectiveness of magnitude- and phase-related loss terms for noisy and reverberant signals, we show that the proposed phase continuity loss plays a crucial role in speech enhancement. Based on these results, we conclude that the phase continuity loss is useful for training a speech enhancement model, especially for the denoising task, and that magnitude- and phase-related loss terms should be properly weighted depending on the type of signal distortion.
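The abstract does not give the formula for the phase continuity loss, so the following is only an illustrative sketch of what a loss on relative phase could look like, not the authors' definition. It assumes that "relative phase" means the phase difference between consecutive STFT frames in each frequency bin, and it penalizes the mismatch between estimated and reference differences with a cosine distance. The function names `relative_phase` and `phase_continuity_loss` are hypothetical.

```python
import cmath
import math


def relative_phase(spec):
    """Phase difference between consecutive frames, per frequency bin.

    `spec` is a list of frames, each a list of complex STFT coefficients.
    Assumption: "relative phase" is taken along the time axis only.
    """
    return [
        [cmath.phase(cur * prev.conjugate()) for prev, cur in zip(row_prev, row_cur)]
        for row_prev, row_cur in zip(spec, spec[1:])
    ]


def phase_continuity_loss(est, ref):
    """Mean cosine distance between estimated and reference relative phases.

    Illustrative only: the cosine distance avoids phase-wrapping issues,
    but the paper may use a different distance or additional axes.
    """
    d_est = relative_phase(est)
    d_ref = relative_phase(ref)
    terms = [
        1.0 - math.cos(a - b)
        for row_e, row_r in zip(d_est, d_ref)
        for a, b in zip(row_e, row_r)
    ]
    return sum(terms) / len(terms)


# Toy spectrograms: 5 frames x 4 bins with unit magnitude.
ref = [[cmath.rect(1.0, 0.3 * t * (f + 1)) for f in range(4)] for t in range(5)]
# Same signal with a constant phase offset applied to every coefficient.
shifted = [[x * cmath.rect(1.0, 0.7) for x in row] for row in ref]

loss_shifted = phase_continuity_loss(shifted, ref)
```

Under these assumptions, a constant phase offset leaves the frame-to-frame differences unchanged, so `loss_shifted` is zero up to rounding, whereas a loss on absolute phase would penalize the offset. This is one way a relative-phase loss can differ from an absolute-phase loss.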