Papers

Deep learning-based speech presence probability estimation for noise PSD estimation in single-channel speech enhancement

International Conference
2016~2020
Posted by
Hyewon Han
Date posted
2018-05-01 16:35
Authors : Haemin Yang, Soyeon Choe, Keulbit Kim, Hong-Goo Kang

Year : 2018

Publisher / Conference : ICSigSys

In single-channel speech enhancement, it is essential to determine noise reduction factors that remove noise successfully while minimizing speech distortion. These factors are typically set as a function of the noise power spectral density (PSD) in the time-frequency domain, and state-of-the-art algorithms add further processing to estimate the speech presence probability (SPP), which improves the noise PSD estimate. Because of the many tuning parameters involved, however, it is not easy to implement an algorithm that reliably estimates SPP in noise-varying environments. We propose a combination of a deep learning network and an effective training method to enhance the performance of the SPP estimation module. The proposed approach can be regarded as a hybrid approach, since the noise reduction factor is still estimated by conventional statistics-based single-channel enhancement algorithms. The advantages and disadvantages of the proposed approach compared with fully deep learning-based single-channel speech enhancement are also investigated.
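As a rough illustration of the hybrid structure described in the abstract, the sketch below shows how a per-bin SPP value could drive a recursive noise PSD update, and how that noise PSD could then set a noise reduction gain. It is not the paper's implementation: the soft-decision update rule, the Wiener-style gain, the function names, and the parameters `alpha` and `gain_floor` are all assumptions chosen for illustration, and the `spp` values are simply taken as given (in the proposed approach they would come from the trained network).

```python
def update_noise_psd(prev_noise_psd, noisy_power, spp, alpha=0.8):
    """One frame of SPP-weighted recursive noise PSD smoothing (illustrative).

    prev_noise_psd: previous noise PSD estimate per frequency bin
    noisy_power:    |Y(k)|^2 of the current noisy frame per bin
    spp:            speech presence probability per bin, in [0, 1]
    alpha:          smoothing constant (assumed value, not from the paper)
    """
    updated = []
    for n_prev, y_pow, p in zip(prev_noise_psd, noisy_power, spp):
        # Soft decision: when speech is likely absent (p near 0) trust the
        # observed power; when speech is likely present keep the old estimate.
        noise_periodogram = (1.0 - p) * y_pow + p * n_prev
        updated.append(alpha * n_prev + (1.0 - alpha) * noise_periodogram)
    return updated


def wiener_gain(noisy_power, noise_psd, gain_floor=0.1):
    """Wiener-style noise reduction factor per bin (illustrative)."""
    gains = []
    for y_pow, n in zip(noisy_power, noise_psd):
        # Maximum-likelihood a priori SNR estimate, clipped at zero.
        snr = max(y_pow / max(n, 1e-12) - 1.0, 0.0)
        # Floor the gain to limit speech distortion and musical noise.
        gains.append(max(snr / (1.0 + snr), gain_floor))
    return gains


# Two bins with the same observed power: speech absent in bin 0, present in bin 1.
noise = update_noise_psd([1.0, 1.0], [4.0, 4.0], [0.0, 1.0])
gains = wiener_gain([4.0, 4.0], noise)
```

In this toy frame, the noise estimate rises only in the bin where speech is judged absent, so the same observed power yields a smaller gain (more attenuation) there than in the bin where speech is judged present.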
Total 355
315 International Conference Kihyuk Jeong, Huu-Kim Nguyen, Hong-Goo Kang "A Fast and Lightweight Text-To-Speech Model with Spectrum and Waveform Alignment Algorithms" in EUSIPCO, 2021
314 International Conference Jiyoung Lee*, Soo-Whan Chung*, Sunok Kim, Hong-Goo Kang**, Kwanghoon Sohn** "Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation" in CVPR, 2021
313 International Conference Zainab Alhakeem, Hong-Goo Kang "Confidence Learning from Noisy Labels for Arabic Dialect Identification" in ITC-CSCC, 2021
312 International Conference Huu-Kim Nguyen, Kihyuk Jeong, Hong-Goo Kang "Fast and Lightweight Speech Synthesis Model based on FastSpeech2" in ITC-CSCC, 2021
311 International Conference Yoohwan Kwon*, Hee-Soo Heo*, Bong-Jin Lee, Joon Son Chung "The ins and outs of speaker recognition: lessons from VoxSRC 2020" in ICASSP, 2021
310 International Conference You Jin Kim, Hee Soo Heo, Soo-Whan Chung, Bong-Jin Lee "End-to-end Lip Synchronisation Based on Pattern Classification" in IEEE Spoken Language Technology Workshop (SLT), 2020
309 International Conference Seong Min Kye, Yoohwan Kwon, Joon Son Chung "Cross Attentive Pooling for Speaker Verification" in IEEE Spoken Language Technology Workshop (SLT), 2020
308 International Conference Suhyeon Oh, Hyungseob Lim, Kyungguen Byun, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang "ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis" in APSIPA (*awarded Best Paper), 2020
307 International Conference Hyeon-Kyeong Shin, Hyewon Han, Kyungguen Byun, Hong-Goo Kang "Speaker-invariant Psychological Stress Detection Using Attention-based Network" in APSIPA, 2020
306 International Conference Min-Jae Hwang, Frank Soong, Eunwoo Song, Xi Wang, Hyeonjoo Kang, Hong-Goo Kang "LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis" in APSIPA, 2020