Papers

Style Modeling for Multi-Speaker Articulation-to-Speech

International Conference
작성자
이지현
작성일
2023-02-21 14:04
조회
892
Authors : Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang

Year : 2023

Publisher / Conference : ICASSP

Research area : Speech Signal Processing, Speech Synthesis

Related project : [RCMS]서울대학교 산학협력단/ 음성의사소통을 위한 완전이식형 폐회로 Brain to X 개발(1/5,3/3)

Presentation : Poster

In this paper, we propose a neural articulation-to-speech (ATS) framework that synthesizes high-quality speech from articulatory signal in a multi-speaker situation. Most conventional ATS approaches only focus on modeling contextual information of speech from a single speaker’s articulatory features. To explicitly represent each speaker’s speaking style as well as the contextual information, our proposed model estimates style embeddings, guided from the essential speech style attributes such as pitch and energy. We adopt convolutional layers and transformer-based attention layers for our model to fully utilize both local and global information of articulatory signals, measured by electromagnetic articulography (EMA). Our model significantly improves the quality of synthesized speech compared to the baseline in terms of objective and subjective measurements in the Haskins dataset.
전체 360
137 International Conference Hyungchan Yoon, Changhwan Kim, Eunwoo Song, Hyun-Wook Yoon, Hong-Goo Kang "Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech" in INTERSPEECH, 2023
136 International Conference Doyeon Kim, Soo-Whan Chung, Hyewon Han, Youna Ji, Hong-Goo Kang "HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders" in INTERSPEECH, 2023
135 International Conference Zhenyu Piao, Miseul Kim, Hyungchan Yoon, Hong-Goo Kang "HappyQuokka System for ICASSP 2023 Auditory EEG Challenge" in ICASSP, 2023
134 International Conference Byeong Hyeon Kim, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang "Progressive Multi-Stage Neural Audio Codec with Psychoacoustic Loss and Discriminator" in ICASSP, 2023
133 International Conference Hyungseob Lim, Jihyun Lee, Byeong Hyeon Kim, Inseon Jang, Hong-Goo Kang "End-to-End Neural Audio Coding in the MDCT Domain" in ICASSP, 2023
132 International Conference Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang "Style Modeling for Multi-Speaker Articulation-to-Speech" in ICASSP, 2023
131 International Conference Hyeon-Kyeong Shin, Hyewon Han, Doyeon Kim, Soo-Whan Chung, Hong-Goo Kang "Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting" in INTERSPEECH (*Best Student Paper Finalist), 2022
130 International Conference Changhwan Kim, Seyun Um, Hyungchan Yoon, Hong-goo Kang "FluentTTS: Text-dependent Fine-grained Style Control for Multi-style TTS" in INTERSPEECH, 2022
129 International Conference Miseul Kim, Zhenyu Piao, Seyun Um, Ran Lee, Jaemin Joh, Seungshin Lee, Hong-Goo Kang "Light-Weight Speaker Verification with Global Context Information" in INTERSPEECH, 2022
128 International Conference Doyeon Kim, Hyewon Han, Hyeon-Kyeong Shin, Soo-Whan Chung, Hong-Goo Kang "Phase Continuity: Learning Derivatives of Phase Spectrum for Speech Enhancement" in ICASSP, 2022