A Unified Framework for the Generation of Glottal Signals in Deep Learning-based Parametric Speech Synthesis Systems

International Conference
2018-09-01 16:37
Authors : Min-Jae Hwang, Eunwoo Song, Jin-Seob Kim, Hong-Goo Kang

Year : 2018

Publisher / Conference : INTERSPEECH

In this paper, we propose a unified training framework for the generation of glottal signals in deep learning (DL)-based parametric speech synthesis systems. The glottal vocoding-based speech synthesis system, especially the modeling-by-generation (MbG) structure that we proposed recently, significantly improves the naturalness of synthesized speech by faithfully representing the noise component of the glottal excitation with an additional DL structure. Because the MbG method introduces a multistage processing pipeline, however, its training process is complicated and inefficient. To alleviate this problem, we propose a unified training approach that directly generates speech parameters by merging all the required models, such as acoustic, glottal and noise models into a single unified network. Considering the fact that noise analysis should be performed after training the glottal model, we also propose a stochastic noise analysis method that enables noise modeling to be included in the unified training process by iteratively analyzing the noise component in every epoch. Both objective and subjective test results verify the superiority of the proposed algorithm compared to conventional methods.
전체 319
279 International Conference Min-Jae Hwang, Eunwoo Song, Jin-Seob Kim, Hong-Goo Kang "A Unified Framework for the Generation of Glottal Signals in Deep Learning-based Parametric Speech Synthesis Systems" in INTERSPEECH, 2018
278 Domestic Conference 양원, 정수환, 강홍구 "비학습 데이터 적응화 기법을 이용한 딥러닝 기반 한국어 음성 인식 기술" in 한국음향학회 추계발표대회, 2018
277 Domestic Conference 최소연, 정수환, 강홍구 "임베딩 매트릭스를 기반으로 한 비정상적 잡음 제거 알고리즘의 분석과 딥러닝 음질개선 방법들과의 성능비교" in 한국음향학회 추계발표대회, 2018
276 International Journal Jinkyu Lee, Jan Skoglund, Turaj Shabestary, Hong-Goo Kang "Phase-Sensitive Joint Learning Algorithms for Deep Learning-Based Speech Enhancement" in IEEE Signal Processing Letters, vol.25, issue 8, pp.1276-1280, 2018
275 International Conference Haemin Yang, Soyeon Choe, Keulbit Kim, Hong-Goo Kang "Deep learning-based speech presence probability estimation for noise PSD estimation in single-channel speech enhancement" in ICSigSys, 2018
274 International Conference Min-Jae Hwang, Eunwoo Song, Kyungguen Byun, Hong-Goo Kang "Modeling-by-Generation-Structured Noise Compensation Algorithm for Glottal Vocoding Speech Synthesis System" in ICASSP, 2018
273 International Conference Jinyoung Lee, Chahyeon Eom, Youngsu Kwak, Hong-Goo Kang, Chungyoung Lee "DNN-based Wireless Positioning in An Outdoor Environment" in ICASSP, 2018
272 International Conference Seung-chul Shin, Sangyeop Lee, Taeho Lee, Kyoungwoo Lee, Yong Seung Lee, Hong-Goo Kang "Two electrode based healthcare device for continuously monitoring ECG and BIA signals" in BHI, 2018
271 International Journal JeeSok Lee, Soo-Whan Chung, Min-Seok Choi, Hong-Goo Kang "Generic uniform search grid generation algorithm for far-field source localization" in The Journal of the Acoustical Society of America, vol.143, 2018
270 International Journal Min-Jae Hwang, JeeSok Lee, MiSuk Lee, Hong-Goo Kang "SVD-Based Adaptive QIM Watermarking on Stereo Audio Signals" in IEEE Transactions on Multimedia, vol.20, issue 1, pp.45-54, 2018