Papers

Light-Weight Speaker Verification with Global Context Information

International Conference
Posted by : dsp
Posted on : 2022-06-16 17:03
Authors : Miseul Kim, Zhenyu Piao, Seyun Um, Ran Lee, Jaemin Joh, Seungshin Lee, Hong-Goo Kang

Year : 2022

Publisher / Conference : INTERSPEECH

Research area : Speech Signal Processing, Speaker Recognition

Presentation : Poster

In this paper, we propose a light-weight speaker verification (SV) system that utilizes the characteristics of utterance-level global features.
Many recent SV systems employ convolutional neural networks (CNNs) to extract representative speaker features from the input utterances. However, the receptive field available during feature extraction is inherently limited by the localized structure of the convolutional layers.
To effectively extract utterance-level global speaker representations, we introduce a novel architecture combining a CNN with a self-attention network that is able to utilize the relationship between local and aggregated global features. The global features are continuously updated at every analysis block using a point-wise attentive summation to the local features.
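The abstract does not spell out the exact formulation of the attentive summation, so the following is only a minimal sketch of one generic way a global feature could be refreshed from frame-level local features. The function names (`softmax`, `update_global`), the scaled dot-product scoring, and the residual update are all assumptions made for illustration, not the authors' implementation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scalar scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def update_global(global_feat, local_feats):
    """Hypothetical update: attend over local frames, then add the
    attention-weighted summary to the current global feature."""
    dim = len(global_feat)
    # Assumed scoring rule: scaled dot product between the global
    # feature and each local frame.
    scores = [
        sum(g * x for g, x in zip(global_feat, frame)) / math.sqrt(dim)
        for frame in local_feats
    ]
    weights = softmax(scores)
    # Attentive summation: weighted sum of local frames, per channel.
    summary = [
        sum(w * frame[c] for w, frame in zip(weights, local_feats))
        for c in range(dim)
    ]
    # Assumed residual-style update of the global feature.
    updated = [g + s for g, s in zip(global_feat, summary)]
    return updated, weights

# Toy input: three frames with two channels each (made-up values).
local_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_feat = [0.0, 0.0]
updated, weights = update_global(global_feat, local_feats)
```

In a stacked model, a step like this would be repeated once per analysis block, so the global feature accumulates context from the whole utterance while the convolutional layers continue to operate locally.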
We also adopt a densely connected CNN structure (DenseNet) to reliably estimate speaker-related local features with a small number of model parameters. Our proposed model achieves higher speaker verification performance, with an equal error rate (EER) of 1.935%, using only 1.2M parameters, a model size reduced by 16% compared with the baseline models.
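For readers unfamiliar with the metric, the sketch below shows one simple way an EER can be estimated from trial scores: sweep a decision threshold and report the point where the false acceptance rate and the false rejection rate are closest. The function name `equal_error_rate` and the toy scores are made up for illustration; they are unrelated to the paper's evaluation data and scoring setup.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping thresholds over all observed scores.

    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = float("inf")
    eer = None
    for th in thresholds:
        # False rejection rate: same-speaker trials scored below threshold.
        frr = sum(s < th for s in target_scores) / len(target_scores)
        # False acceptance rate: different-speaker trials at or above it.
        far = sum(s >= th for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0
    return eer

# Toy similarity scores (made-up values, not from the paper).
target = [0.9, 0.8, 0.7, 0.4]      # same-speaker trials
nontarget = [0.6, 0.3, 0.2, 0.1]   # different-speaker trials
eer = equal_error_rate(target, nontarget)
```

With these toy scores the two error rates meet at a threshold of 0.6, where one of four target trials is rejected and one of four non-target trials is accepted, so the estimate is 0.25 (25%).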