Papers

Deep Neural Network-Based Statistical Parametric Speech Synthesis System Using Improved Time-Frequency Trajectory Excitation Mo

International Conference
2011~2015
Posted by : 한혜원
Date posted : 2015-09-01 00:47
Views : 1653
Authors : Eunwoo Song, Hong-Goo Kang

Year : 2015

Publisher / Conference : INTERSPEECH

This paper proposes a deep neural network (DNN)-based statistical parametric speech synthesis system using an improved time-frequency trajectory excitation (ITFTE) model. The ITFTE model, which efficiently reduces the parametric redundancy of a TFTE model, improved the perceptual quality of the vocoding process and the estimation accuracy of the training process. However, problems remain in training ITFTE parameters within a hidden Markov model (HMM) framework: inefficient representation of cross-dimensional correlations between ITFTE parameters, over-smoothed outputs caused by statistical averaging, and an over-fitted model due to the decision tree-based state clustering paradigm. To alleviate these limitations, a centralized DNN replaces the decision trees of the HMM training process. Analysis of trainability confirms that the DNN training process improves the model accuracy, which results in improved perceptual quality of synthesized speech. Objective and subjective test results also verify that the proposed system performs better than the conventional HMM-based system.
Total : 363
77 International Conference Kyungguen Byun, Eunwoo Song, Hong-goo Kang "A constrained two-layer compression technique for ECG waves" in Engineering in Medicine and Biology Society (EMBC), 2015
76 International Conference Eunwoo Song, Hong-Goo Kang "Deep Neural Network-Based Statistical Parametric Speech Synthesis System Using Improved Time-Frequency Trajectory Excitation Mo" in INTERSPEECH, 2015
75 International Conference Heejin Ahn, Eunwoo Song, Won-Suk Jun, Hong-goo Kang "A Compression Algorithms for Hidden Markov Model-Based Speech Synthesis Systems" in ITC-CSCC, pp.942-945, 2015
74 International Conference JeeSok Lee, Sejin Oh, Hong-Goo Kang "Coherent channel based subband multichannel dereverberation" in ICASSP, pp.2704-2708, 2015
73 International Conference Eunwoo Song, Young-Sun Joo, Hong-Goo Kang "Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system" in ICASSP, 2015
72 International Conference Eunwoo Song, Hong-Goo Kang, Joonil Lee "Fixed-point implementation of MPEG-D unified speech and audio coding decoder" in 19th International Conference on Digital Signal Processing (DSP), pp.110-113, 2014
71 International Conference Soonho Baek, Hong-Goo Kang "Mean normalization of power function based cepstral coefficients for robust speech recognition in noisy environment" in ICASSP, 2014
70 International Conference Ho Seon Shin, Hong-Goo Kang "Bone-Conduction Speech Enhancement using a Speaker-Independent Filter" in ICEIC, 2014
69 International Conference Soonho Baek, Hong-Goo Kang "Vector Taylor Series based HMM Adaptation for Generalized Cepstrum in Noisy Environment" in ASRU, 2013
68 International Conference Jung-Won Lee, Hong-Goo Kang, Samuel Kim, Yoonjae Lee "Detecting pathological speech using local and global characteristics of harmonic-to-noise ratio" in APSIPA, 2013