Papers

A Study on Acoustic Parameter Selection Strategies to Improve Deep Learning-Based Speech Synthesis

International Conference
2016~2020
Posted by
한혜원
Posted on
2019-11-01 16:48
Authors : Hyeonjoo Kang, Young-Sun Joo, Inseon Jang, Chunghyun Ahn, Hong-Goo Kang

Year : 2019

Publisher / Conference : APSIPA

In this paper, we investigate how the performance of a deep learning-based speech synthesis (DLSS) system varies with the configuration of its output acoustic parameters. Our method is mainly applicable to vocoding-based statistical parametric speech synthesis (SPSS), which has advantages in low-resource scenarios. Given the independence assumption of the source-filter model for the spectral and fundamental frequency (F0) parameters, we propose a reliable network architecture for training acoustic parameters. In particular, the F0 parameter suffers from high fluctuation and an extremely low number of dimensions. To relieve these problems, we introduce a context-window approach. Furthermore, we apply data augmentation to the proposed structure to overcome a lack of training data, which is a frequent issue with multi-speaker TTS systems. Experimental results confirm the superiority of the proposed algorithm over conventional ones in both single-speaker and multi-speaker TTS setups.
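
The abstract does not say how the context window is constructed. As a rough illustration only, one common way to give a one-dimensional F0 track more dimensions per frame is to stack each frame with its neighbours. The function name `stack_context`, the window sizes, and the edge-padding rule below are all assumptions made for this sketch, not details taken from the paper.

```python
from typing import List


def stack_context(f0: List[float], left: int = 2, right: int = 2) -> List[List[float]]:
    """Stack each F0 frame with its neighbouring frames (hypothetical helper).

    Each output row holds `left` past frames, the current frame, and `right`
    future frames. Positions that fall outside the track are filled by
    repeating the first or last frame (an assumed padding rule).
    """
    if not f0:
        return []
    last = len(f0) - 1
    stacked = []
    for t in range(len(f0)):
        # Clamp neighbour indices to the valid range [0, last].
        window = [f0[min(max(t + k, 0), last)] for k in range(-left, right + 1)]
        stacked.append(window)
    return stacked


if __name__ == "__main__":
    # Toy F0 values in Hz, invented for illustration.
    frames = [100.0, 110.0, 120.0, 130.0]
    for row in stack_context(frames, left=1, right=1):
        print(row)
```

With `left=1` and `right=1`, each scalar F0 value becomes a three-dimensional vector, which is one plausible reading of how a context window could offset the low dimensionality mentioned above.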
Total: 369
117 International Conference Hyewon Han, Soo-Whan Chung, Hong-Goo Kang "MIRNet: Learning multiple identities representations in overlapped speech" in INTERSPEECH, 2020
116 International Conference Yoohwan Kwon, Soo-Whan Chung, Hong-Goo Kang "Intra-Class Variation Reduction of Speaker Representation in Disentanglement Framework" in INTERSPEECH, 2020
115 International Conference Minh-Tri Ho, Jinyoung Lee, Bong-Ki Lee, Dong Hoon Yi, Hong-Goo Kang "A Cross-channel Attention-based Wave-U-Net for Multi-channel Speech Enhancement" in INTERSPEECH, 2020
114 International Journal Young-Sun Joo, Hanbin Bae, Young-Ik Kim, Hoon-Young Cho, Hong-Goo Kang "Effective Emotion Transplantation in an End-to-End Text-to-Speech System" in IEEE Access, vol.8, pp.161713-161719, 2020
113 International Conference Seyun Um, Sangshin Oh, Kyungguen Byun, Inseon Jang, ChungHyun Ahn, Hong-Goo Kang "Emotional Speech Synthesis with Rich and Granularized Control" in ICASSP, 2020
112 International Conference Min-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto, Frank Soong, Hong-Goo Kang "Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network" in ICASSP, 2020
111 International Conference Hyeonjoo Kang, Young-Sun Joo, Inseon Jang, Chunghyun Ahn, Hong-Goo Kang "A Study on Acoustic Parameter Selection Strategies to Improve Deep Learning-Based Speech Synthesis" in APSIPA, 2019
110 International Journal Ohsung Kwon, Inseon Jang, ChungHyun Ahn, Hong-Goo Kang "An Effective Style Token Weight Control Technique for End-to-End Emotional Speech Synthesis" in IEEE Signal Processing Letters, vol.26, issue 9, pp.1383-1387, 2019
109 International Conference Min-Jae Hwang, Hong-Goo Kang "Parameter enhancement for MELP speech codec in noisy communication environment" in INTERSPEECH, 2019
108 International Journal Jinkyu Lee, Hong-Goo Kang "A Joint Learning Algorithm for Complex-Valued T-F Masks in Deep Learning-Based Single-Channel Speech Enhancement Systems" in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.27, issue 6, pp.1098-1108, 2019