Papers

SC-ERM: Speaker-Centric Learning for Speech Emotion Recognition

International Conference
2021~
Posted by : dsp
Posted on : 2024-01-22 16:12
Authors : Juhwan Yoon, Seyun Um, Woo-Jin Chung, Hong-Goo Kang

Year : 2024

Publisher / Conference : International Conference on Electronics, Information, and Communication (ICEIC)

Research area : Speech Signal Processing, Etc

Presentation/Publication date : 2024.01.29

Presentation : Poster

We propose a novel deep learning-based model for speech emotion recognition, SC-ERM, which focuses on speaker-centric learning. This model effectively estimates emotions and demonstrates the ability to generalize to unseen speakers. Our proposed model utilizes speaker-specific emotion characteristics in two steps: first, it extracts emotion representations using an emotion encoder, and second, it employs speaker-centric learning by incorporating speaker style embeddings as a condition through a speaker mask generator. We evaluate our model’s performance using an emotional dataset and find that it demonstrates outstanding performance in recognizing emotional states. Notably, it achieves a 9.2% relative improvement in accuracy compared to the baseline when classifying emotions for speakers not seen during training. Overall, our model demonstrates promising performance in accurately identifying emotions across a range of emotional expressions, irrespective of the speakers involved.
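The abstract does not give architectural details, so the following is only a rough sketch of one way the two-step idea could be read: an emotion representation is gated element-wise by a mask computed from a speaker style embedding, and the gated features are then classified. Every name, dimension, and layer choice here (`speaker_mask`, `classify`, the sigmoid gate, single linear layers, random weights) is an assumption made for illustration and is not taken from the paper.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer; weights has shape [out_dim][in_dim]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(out_dim, in_dim, rng):
    """Random untrained weights and zero bias, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def speaker_mask(speaker_embedding, mask_layer):
    """Hypothetical mask generator: one gate in (0, 1) per emotion feature."""
    return [sigmoid(z) for z in linear(speaker_embedding, *mask_layer)]


def classify(emotion_repr, speaker_embedding, mask_layer, clf_layer):
    """Gate the emotion representation with the speaker mask, then classify."""
    mask = speaker_mask(speaker_embedding, mask_layer)
    masked = [m * e for m, e in zip(mask, emotion_repr)]
    return softmax(linear(masked, *clf_layer))


rng = random.Random(0)
EMO_DIM, SPK_DIM, NUM_EMOTIONS = 8, 4, 4  # assumed sizes, not from the paper

mask_layer = init_layer(EMO_DIM, SPK_DIM, rng)
clf_layer = init_layer(NUM_EMOTIONS, EMO_DIM, rng)

# Stand-ins for the outputs of an emotion encoder and a speaker style encoder.
emotion_repr = [rng.gauss(0, 1) for _ in range(EMO_DIM)]
speaker_embedding = [rng.gauss(0, 1) for _ in range(SPK_DIM)]

mask = speaker_mask(speaker_embedding, mask_layer)
probs = classify(emotion_repr, speaker_embedding, mask_layer, clf_layer)
print(probs)  # a probability distribution over the assumed emotion classes
```

In this reading, the mask lets the same emotion features be weighted differently per speaker, which is one plausible way to condition on speaker style; the actual SC-ERM design may differ.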
Total : 367
150 International Conference Yeona Hong, Miseul Kim, Woo-Jin Chung, Hong-Goo Kang "Contextual Learning for Missing Speech Automatic Speech Recognition" in International Conference on Electronics, Information, and Communication (ICEIC), 2024
149 International Conference Juhwan Yoon, Seyun Um, Woo-Jin Chung, Hong-Goo Kang "SC-ERM: Speaker-Centric Learning for Speech Emotion Recognition" in International Conference on Electronics, Information, and Communication (ICEIC), 2024
148 International Conference Hejung Yang, Hong-Goo Kang "On Fine-Tuning Pre-Trained Speech Models With EMA-Target Self-Supervised Loss" in ICASSP, 2024
147 International Conference Hong-Goo Kang, W. Bastiaan Kleijn, Jan Skoglund, Michael Chinen "Convolutional Transformer for Neural Speech Coding" in Audio Engineering Society Convention, 2023
146 International Conference Hong-Goo Kang, Jan Skoglund, W. Bastiaan Kleijn, Andrew Storus, Hengchin Yeh "A High-Rate Extension to Soundstream" in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
145 International Conference Zhenyu Piao, Hyungseob Lim, Miseul Kim, Hong-Goo Kang "PDF-NET: Pitch-adaptive Dynamic Filter Network for Intra-gender Speaker Verification" in APSIPA ASC, 2023
144 International Conference WooSeok Ko, Seyun Um, Zhenyu Piao, Hong-Goo Kang "Consideration of Varying Training Lengths for Short-Duration Speaker Verification" in APSIPA ASC, 2023
143 International Conference Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang "BrainTalker: Low-Resource Brain-to-Speech Synthesis with Transfer Learning using Wav2Vec 2.0" in The IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), 2023
142 International Conference Seyun Um, Jihyun Kim, Jihyun Lee, Hong-Goo Kang "Facetron: A Multi-speaker Face-to-Speech Model based on Cross-Modal Latent Representations" in EUSIPCO, 2023
141 International Conference Hejung Yang, Hong-Goo Kang "Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement" in INTERSPEECH, 2023