Papers

A Joint Learning Algorithm for Complex-Valued T-F Masks in Deep Learning-Based Single-Channel Speech Enhancement Systems

International Journal
2016~2020
작성자
이진영
작성일
2019-06-01 22:12
조회
1656
Authors : Jinkyu Lee, Hong-Goo Kang

Year : 2019

Publisher / Conference : IEEE/ACM Transactions on Audio, Speech, and Language Processing

Volume : 27, issue 6

Page : 1098-1108

This paper presents a joint learning algorithm for complex-valued time-frequency (T-F) masks in single-channel speech enhancement systems. Most speech enhancement algorithms operating in a single-channel microphone environment aim to enhance the magnitude component in a T-F domain, while the input noisy phase component is used directly without any processing. Consequently, the mismatch between the processed magnitude and the unprocessed phase degrades the sound quality. To address this issue, a learning method of targeting a T-F mask that is defined in a complex domain has recently been proposed. However, due to a wide dynamic range and an irregular spectrogram pattern of the complex-valued T-F mask, the learning process is difficult even with a large-scale deep learning network. Moreover, the learning process targeting the T-F mask itself does not directly minimize the distortion in spectra or time domains. In order to address these concerns, we focus on three issues: 1) an effective estimation of complex numbers with a wide dynamic range; 2) a learning method that is directly related to speech enhancement performance; and 3) a way to resolve the mismatch between the estimated magnitude and phase spectra. In this study, we propose objective functions that can solve each of these issues and train the network by minimizing them with a joint learning framework. The evaluation results demonstrate that the proposed learning algorithm achieves significant performance improvement in various objective measures and subjective preference listening test.
전체 355
54 International Journal Seung-Chul Shin, Jinkyu Lee, Soyeon Choe, Hyuk In Yang, Jihee Min, Ki-Yong Ahn, Justin Y. Jeon, Hong-Goo Kang "Dry Electrode-Based Body Fat Estimation System with Anthropometric Data for Use in a Wearable Device" in Sensors, vol.19, issue 9, 2019
53 International Journal Jinkyu Lee, Jan Skoglund, Turaj Shabestary, Hong-Goo Kang "Phase-Sensitive Joint Learning Algorithms for Deep Learning-Based Speech Enhancement" in IEEE Signal Processing Letters, vol.25, issue 8, pp.1276-1280, 2018
52 International Journal JeeSok Lee, Soo-Whan Chung, Min-Seok Choi, Hong-Goo Kang "Generic uniform search grid generation algorithm for far-field source localization" in The Journal of the Acoustical Society of America, vol.143, 2018
51 International Journal Min-Jae Hwang, JeeSok Lee, MiSuk Lee, Hong-Goo Kang "SVD-Based Adaptive QIM Watermarking on Stereo Audio Signals" in IEEE Transactions on Multimedia, vol.20, issue 1, pp.45-54, 2018
50 International Journal Eunwoo Song, Frank K. Soong, Hong-Goo Kang "Effective Spectral and Excitation Modeling Techniques for LSTM-RNN-Based Speech Synthesis Systems" in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue 11, pp.2152-2161, 2017
49 International Journal Ho Seon Shin, Tim Fingscheidt, Hong-Goo Kang "A Priori SNR Estimation Using Air- and Bone-Conduction Microphones" in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue 11, pp.2015-2025, 2015
48 International Journal Taegyu Lee, Hyun Oh Oh, Jeongil Seo, Young-Cheol Park, Dae Hee Youn "Scalable Multiband Binaural Renderer for MPEG-H 3D Audio" in IEEE Journal of Selected Topics in Signal Processing, vol.9, issue 5, pp.907-920, 2015
47 International Journal Taegyu Lee, Yonghyun Baek, Young-Cheol Park, Dae Hee Youn "Stereo upmix-based binaural auralization for mobile devices" in IEEE Transactions on Consumer Electronics, vol.60, issue 3, pp.411-419, 2014
46 International Journal Soonho Baek, Hong-Goo Kang "Selection of spectral compressive operator for vector Taylor series-based model adaptation in noisy environments" in The Journal of the Acoustical Society of America, vol.135, 2014
45 International Journal Jae-Mo Yang, Hong-Goo Kang "Online Speech Dereverberation Algorithm Based on Adaptive Multichannel Linear Prediction" in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue 3, pp.608-619, 2014