Papers

Emotional Speech Synthesis Based on Style Embedded Tacotron2 Framework

International Conference
2016~2020
작성자
한혜원
작성일
2019-06-01 16:43
조회
1324
Authors : Ohsung Kwon, Inseon Jang, ChungHyun Ahn, Hong-Goo Kang

Year : 2019

Publisher / Conference : ITC-CSCC

In this paper, we propose a speech synthesis system that effectively generates multiple types of emotional speech using the concept of global style token (GST); where the emotion-related style information is presented by an additional style embedding vector. Although the GST is not a new idea, no one has been utilized the idea for an emotional speech synthesis task. We explicitly combine the GST idea with the Tacotron2 framework to implement an emotional text-to-speech system. The analysis results demonstrate that the proposed GST structure successfully transfers various types of emotional information to the synthesized speech. Subjective listening tests to evaluate the naturalness and emotional expression of synthesized speech are conducted to verify the superiority of the proposed algorithm.
전체 355
305 International Conference Hyungseob Lim, Suhyeon Oh, Kyungguen Byun, Hong-Goo Kang "A Study on Conditional Features for a Flow-based Neural Vocoder" in Asilomar Conference on Signals, Systems, and Computers, 2020
304 International Conference Soo-Whan Chung, Soyeon Choe, Joon Son Chung, Hong-Goo Kang "FaceFilter: Audio-visual speech separation using still images" in INTERSPEECH (*awarded Best Student Paper), 2020
303 International Conference Soo-Whan Chung, Hong-Goo Kang, Joon Son Chung "Seeing Voices and Hearing Voices: Learning Discriminative Embeddings Using Cross-Modal Self-Supervision" in INTERSPEECH, 2020
302 International Conference Hyewon Han, Soo-Whan Chung, Hong-Goo Kang "MIRNet: Learning multiple identities representations in overlapped speech" in INTERSPEECH, 2020
301 International Conference Yoohwan Kwon, Soo-Whan Chung, Hong-Goo Kang "Intra-Class Variation Reduction of Speaker Representation in Disentanglement Framework" in INTERSPEECH, 2020
300 International Conference Minh-Tri Ho, Jinyoung Lee, Bong-Ki Lee, Dong Hoon Yi, Hong-Goo Kang "A Cross-channel Attention-based Wave-U-Net for Multi-channel Speech Enhancement" in INTERSPEECH, 2020
299 International Journal Young-Sun Joo, Hanbin Bae, Young-Ik Kim, Hoon-Young Cho, Hong-Goo Kang "Effective Emotion Transplantation in an End-to-End Text-to-Speech System" in IEEE Access, vol.8, pp.161713-161719, 2020
298 Domestic Journal 권유환, 정수환, 강홍구 "화자 인식을 위한 적대학습 기반음성 분리 프레임워크에 대한 연구" in 한국음향학회지, vol.39, 제 5호, pp.447-453, 2020
297 Domestic Conference 오태양, 정기혁, 강홍구 "화자 및 발화 스타일 임베딩을 통한 다화자 음성합성 시스템 음질 향상" in 전자공학회 하계학술대회, pp.980-982, 2020
296 Domestic Conference 이성현, 강홍구 "딥러닝 기반 종단 간 다채널 음질 개선 알고리즘" in 전자공학회 하계학술대회, pp.968-970, 2020