Papers

Intra-Class Variation Reduction of Speaker Representation in Disentanglement Framework

International Conference
2016~2020
작성자
한혜원
작성일
2020-10-01 16:52
조회
384
Authors : Yoohwan Kwon, Soo-Whan Chung, Hong-Goo Kang

Year : 2020

Publisher / Conference : INTERSPEECH

Presentation : 구두

In this paper, we propose an effective training strategy to ex-tract robust speaker representations from a speech signal. Oneof the key challenges in speaker recognition tasks is to learnlatent representations or embeddings containing solely speakercharacteristic information in order to be robust in terms of intra-speaker variations. By modifying the network architecture togenerate both speaker-related and speaker-unrelated representa-tions, we exploit a learning criterion which minimizes the mu-tual information between these disentangled embeddings. Wealso introduce an identity change loss criterion which utilizes areconstruction error to different utterances spoken by the samespeaker. Since the proposed criteria reduce the variation ofspeaker characteristics caused by changes in background envi-ronment or spoken content, the resulting embeddings of eachspeaker become more consistent. The effectiveness of the pro-posed method is demonstrated through two tasks; disentangle-ment performance, and improvement of speaker recognition ac-curacy compared to the baseline model on a benchmark dataset,VoxCeleb1. Ablation studies also show the impact of each cri-terion on overall performance.
전체 326
68 International Conference You Jin Kim, Hee Soo Heo, Soo-Whan Chung, Bong-Jin Lee "End-to-end Lip Synchronisation Based on Pattern Classification" in IEEE Spoken Language Technology Workshop (SLT), 2020
67 International Conference Seong Min Kye, Yoohwan Kwon, Joon Son Chung "Cross Attentive Pooling for Speaker Verification" in IEEE Spoken Language Technology Workshop (SLT), 2020
66 International Conference Suhyeon Oh, Hyungseob Lim, Kyungguen Byun, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang "ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis" in APSIPA (*awarded Best Paper), 2020
65 International Conference Hyeon-Kyeong Shin, Hyewon Han, Kyungguen Byun, Hong-Goo Kang "Speaker-invariant Psychological Stress Detection Using Attention-based Network" in APSIPA, 2020
64 International Conference Min-Jae Hwang, Frank Soong, Eunwoo Song, Xi Wang, Hyeonjoo Kang, Hong-Goo Kang "LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis" in APSIPA, 2020
63 International Conference Hyungseob Lim, Suhyeon Oh, Kyungguen Byun, Hong-Goo Kang "A Study on Conditional Features for a Flow-based Neural Vocoder" in Asilomar Conference on Signals, Systems, and Computers, 2020
62 International Conference Soo-Whan Chung, Soyeon Choe, Joon Son Chung, Hong-Goo Kang "FaceFilter: Audio-visual speech separation using still images" in INTERSPEECH (*awarded Best Student Paper), 2020
61 International Conference Soo-Whan Chung, Hong-Goo Kang, Joon Son Chung "Seeing Voices and Hearing Voices: Learning Discriminative Embeddings Using Cross-Modal Self-Supervision" in INTERSPEECH, 2020
60 International Conference Hyewon Han, Soo-Whan Chung, Hong-Goo Kang "MIRNet: Learning multiple identities representations in overlapped speech" in INTERSPEECH, 2020
59 International Conference Yoohwan Kwon, Soo-Whan Chung, Hong-Goo Kang "Intra-Class Variation Reduction of Speaker Representation in Disentanglement Framework" in INTERSPEECH, 2020