Papers

Deep bi-directional long short-term memory based speech enhancement for wind noise reduction

International Conference
2016~2020
Posted by : 한혜원
Date posted : 2017-03-01 16:30
Views : 1390
Authors : Jinkyu Lee, Keulbit Kim, Turaj Shabestary, Hong-Goo Kang

Year : 2017

Publisher / Conference : HSCMA

In this paper, we propose a new recurrent neural network (RNN)-based single-channel speech enhancement framework for off-line wind noise reduction. To adequately represent the highly non-stationary characteristics of wind noise, we first adopt a deep bi-directional long short-term memory (DBLSTM) structure. However, its enhanced output becomes muffled due to the spectral over-smoothing effect. To overcome this problem, we propose a new DBLSTM-based speech enhancement structure that internally incorporates the speech and noise power estimation processes into the spectral filtering framework. Furthermore, we propose a structure with an additional internal constraint of minimizing log a priori SNR, which provides an efficient learning mechanism. Experimental results show that the proposed method improves the source-to-distortion ratio (SDR) by 6.9 dB and the perceptual evaluation of speech quality (PESQ) score by 0.24 compared with the conventional DBLSTM-based system.
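The spectral filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Wiener-style gain computed from per-bin speech and noise power estimates, and it assumes the log a priori SNR constraint takes the form of a mean squared error against a reference. The function names (`wiener_gain`, `enhance`, `log_prior_snr_loss`) and the constant `eps` are hypothetical. In the paper the power estimates are produced inside the DBLSTM; here they are plain arrays.

```python
import numpy as np

def wiener_gain(speech_power, noise_power, eps=1e-10):
    """Wiener-style gain G = xi / (1 + xi), with a priori SNR xi = speech / noise.

    Assumption: the abstract does not state the exact filter; a Wiener gain
    is used here only as a common example of spectral filtering.
    """
    xi = speech_power / (noise_power + eps)  # a priori SNR per time-frequency bin
    return xi / (1.0 + xi)

def enhance(noisy_magnitude, speech_power, noise_power):
    """Apply the gain to the noisy magnitude spectrum, bin by bin."""
    return wiener_gain(speech_power, noise_power) * noisy_magnitude

def log_prior_snr_loss(est_speech, est_noise, ref_speech, ref_noise, eps=1e-10):
    """Mean squared error between estimated and reference log a priori SNR.

    Assumption: this MSE form is illustrative, not taken from the paper.
    """
    est = np.log(est_speech + eps) - np.log(est_noise + eps)
    ref = np.log(ref_speech + eps) - np.log(ref_noise + eps)
    return float(np.mean((est - ref) ** 2))

# Usage with toy values: two frequency bins of one frame.
speech = np.array([1.0, 3.0])
noise = np.array([1.0, 1.0])
noisy = np.array([2.0, 4.0])
enhanced = enhance(noisy, speech, noise)
```

Because the gain is derived from separate speech and noise power estimates rather than from a directly predicted clean spectrum, bins with high estimated SNR pass almost unchanged while noise-dominated bins are attenuated.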
Total 355
112 International Conference Min-Jae Hwang, Frank Soong, Eunwoo Song, Xi Wang, Hyeonjoo Kang, Hong-Goo Kang "LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis" in APSIPA, 2020
111 International Conference Hyungseob Lim, Suhyeon Oh, Kyungguen Byun, Hong-Goo Kang "A Study on Conditional Features for a Flow-based Neural Vocoder" in Asilomar Conference on Signals, Systems, and Computers, 2020
110 International Conference Soo-Whan Chung, Soyeon Choe, Joon Son Chung, Hong-Goo Kang "FaceFilter: Audio-visual speech separation using still images" in INTERSPEECH (*awarded Best Student Paper), 2020
109 International Conference Soo-Whan Chung, Hong-Goo Kang, Joon Son Chung "Seeing Voices and Hearing Voices: Learning Discriminative Embeddings Using Cross-Modal Self-Supervision" in INTERSPEECH, 2020
108 International Conference Hyewon Han, Soo-Whan Chung, Hong-Goo Kang "MIRNet: Learning multiple identities representations in overlapped speech" in INTERSPEECH, 2020
107 International Conference Yoohwan Kwon, Soo-Whan Chung, Hong-Goo Kang "Intra-Class Variation Reduction of Speaker Representation in Disentanglement Framework" in INTERSPEECH, 2020
106 International Conference Minh-Tri Ho, Jinyoung Lee, Bong-Ki Lee, Dong Hoon Yi, Hong-Goo Kang "A Cross-channel Attention-based Wave-U-Net for Multi-channel Speech Enhancement" in INTERSPEECH, 2020
105 International Conference Seyun Um, Sangshin Oh, Kyungguen Byun, Inseon Jang, ChungHyun Ahn, Hong-Goo Kang "Emotional Speech Synthesis with Rich and Granularized Control" in ICASSP, 2020
104 International Conference Min-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto, Frank Soong, Hong-Goo Kang "Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network" in ICASSP, 2020
103 International Conference Hyeonjoo Kang, Young-Sun Joo, Inseon Jang, Chunghyun Ahn, Hong-Goo Kang "A Study on Acoustic Parameter Selection Strategies to Improve Deep Learning-Based Speech Synthesis" in APSIPA, 2019