On Fine-Tuning Pre-Trained Speech Models With EMA-Target Self-Supervised Loss
International Conference
2021~
Posted by
김병현
Posted on
2023-12-14 16:35
However, fine-tuning can degrade the general knowledge acquired during pre-training, knowledge that could otherwise help prevent overfitting when fine-tuning data is sparse and help bridge gaps between domains.
We hypothesize that preserving this general knowledge in pre-trained models is crucial for improving performance on downstream tasks.
Based on this idea, we propose a novel method for fine-tuning self-supervised speech models that utilizes a self-supervised loss over the course of fine-tuning.
Then, an Exponential Moving Average (EMA) technique is applied to smoothly transition the domain of the model from the generalized to the task-oriented one.
We evaluate the proposed method on various downstream tasks and find that it improves performance on most of them. The results show that the method retains the model's generalization ability without compromising downstream task performance.
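The abstract does not give the update rule or say how the two losses are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the EMA target weights are updated as `decay * target + (1 - decay) * online` and that the self-supervised loss is added to the task loss with a scalar weight. The names `ema_update`, `combined_loss`, `decay`, and `ssl_weight` are hypothetical, and plain lists of floats stand in for model parameters.

```python
from typing import List


def ema_update(target: List[float], online: List[float], decay: float) -> List[float]:
    """Move the target weights toward the online weights.

    Assumed rule (not stated in the abstract):
        target <- decay * target + (1 - decay) * online
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must be in [0, 1]")
    if len(target) != len(online):
        raise ValueError("target and online must have the same length")
    return [decay * t + (1.0 - decay) * o for t, o in zip(target, online)]


def combined_loss(task_loss: float, ssl_loss: float, ssl_weight: float) -> float:
    """Hypothetical objective: task loss plus a weighted self-supervised loss."""
    return task_loss + ssl_weight * ssl_loss


# Toy demonstration: the target starts at the "pre-trained" weights and
# drifts toward the "fine-tuned" online weights, which are held fixed here.
target = [0.0, 0.0]   # stands in for pre-trained (general) weights
online = [1.0, -1.0]  # stands in for task-oriented weights
for _ in range(3):
    target = ema_update(target, online, decay=0.5)

print(target)
```

With a decay close to 1 the target changes slowly, so under these assumptions the self-supervised objective stays anchored to the pre-trained model early on and shifts toward the task-oriented model only gradually.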
Total 365
44 | International Conference | Doyeon Kim, Yanjue Song, Nilesh Madhu, Hong-Goo Kang "Enhancing Neural Speech Embeddings for Generative Speech Models" in APSIPA, 2024 | |
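43 | Domestic Conference | 김병현, 강홍구, 장인선 "Deep Neural Network-Based Speech Compression under Low-Latency Conditions" in Korean Institute of Broadcast and Media Engineers (한국방송·미디어공학회) 2024 Summer Conference, 2024 | |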
42 | International Conference | Miseul Kim, Soo-Whan Chung, Youna Ji, Hong-Goo Kang, Min-Seok Choi "Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation" in INTERSPEECH, 2024 | |
41 | International Conference | Woo-Jin Chung, Hong-Goo Kang "Speaker-Independent Acoustic-to-Articulatory Inversion through Multi-Channel Attention Discriminator" in INTERSPEECH, 2024 | |
40 | International Conference | Juhwan Yoon, Woo Seok Ko, Seyun Um, Sungwoong Hwang, Soojoong Hwang, Changhwan Kim, Hong-Goo Kang "UNIQUE : Unsupervised Network for Integrated Speech Quality Evaluation" in INTERSPEECH, 2024 | |
39 | International Conference | Yanjue Song, Doyeon Kim, Hong-Goo Kang, Nilesh Madhu "Spectrum-aware neural vocoder based on self-supervised learning for speech enhancement" in EUSIPCO, 2024 | |
38 | International Conference | Hyewon Han, Naveen Kumar "A cross-talk robust multichannel VAD model for multiparty agent interactions trained using synthetic re-recordings" in Hands-free Speech Communication and Microphone Arrays (HSCMA, Satellite workshop in ICASSP), 2024 | |
37 | International Conference | Yanjue Song, Doyeon Kim, Nilesh Madhu, Hong-Goo Kang "On the Disentanglement and Robustness of Self-Supervised Speech Representations" in International Conference on Electronics, Information, and Communication (ICEIC) (*awarded Best Paper), 2024 | |
36 | International Conference | Yeona Hong, Miseul Kim, Woo-Jin Chung, Hong-Goo Kang "Contextual Learning for Missing Speech Automatic Speech Recognition" in International Conference on Electronics, Information, and Communication (ICEIC), 2024 | |
35 | International Conference | Juhwan Yoon, Seyun Um, Woo-Jin Chung, Hong-Goo Kang "SC-ERM: Speaker-Centric Learning for Speech Emotion Recognition" in International Conference on Electronics, Information, and Communication (ICEIC), 2024 | |