Papers

Disentangled Representations for Arabic Dialect Identification based on Supervised Clustering with Triplet Loss

International Conference
2021~
Posted by
한혜원
Posted on
2021-08-30 14:38
Authors : Zainab Alhakeem, Yoohwan Kwon, Hong-Goo Kang

Year : 2021

Publisher / Conference : EUSIPCO

Research area : Speech Signal Processing, Other

In this paper, we propose a novel supervised clustering with triplet (SCT) loss that effectively learns disentangled representations for Arabic dialect identification (ADI). To improve the performance of ADI using latent representation-based approaches, we need to extract embeddings that contain only dialect-related information by dissociating irrelevant information such as gender, channel, and speaker. Taking the embedding-level distribution into account, our proposed SCT loss minimizes intra-class variations and maximizes inter-class variations. Specifically, it uses the centroid of each dialect as a triplet component, thereby avoiding the issue of choosing an undesirable triplet component due to random sampling. Experimental results on the ADI-17 dataset show that our proposed method significantly outperforms conventional state-of-the-art methods in terms of identification accuracy.
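The abstract does not give the exact SCT formulation, so the following is only a minimal sketch of the general idea of a centroid-based triplet loss, not the authors' method. It assumes Euclidean distance, a fixed margin, the sample's own class centroid as the positive, and the nearest other-class centroid as the negative. The function names `class_centroids` and `centroid_triplet_loss` are hypothetical, chosen here for illustration.

```python
import math
from collections import defaultdict


def class_centroids(embeddings, labels):
    """Return the mean embedding (centroid) of each class label."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid_triplet_loss(embeddings, labels, margin=1.0):
    """Hinge-style triplet loss with class centroids as positive and negative.

    Assumption (not from the paper): the negative is the closest centroid
    belonging to a different class, and the loss is averaged over samples.
    """
    centroids = class_centroids(embeddings, labels)
    if len(centroids) < 2:
        raise ValueError("need at least two classes to form a triplet")
    total = 0.0
    for vec, lab in zip(embeddings, labels):
        # Pull the sample toward its own class centroid.
        d_pos = euclidean(vec, centroids[lab])
        # Push it away from the nearest centroid of any other class.
        d_neg = min(
            euclidean(vec, c) for other, c in centroids.items() if other != lab
        )
        total += max(0.0, d_pos - d_neg + margin)
    return total / len(embeddings)


# Toy two-dimensional "embeddings" for two hypothetical dialect classes.
toy_embeddings = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = ["A", "A", "B", "B"]
loss_small_margin = centroid_triplet_loss(toy_embeddings, toy_labels, margin=1.0)
loss_large_margin = centroid_triplet_loss(toy_embeddings, toy_labels, margin=20.0)
```

Because the positive and negative are centroids rather than randomly sampled utterances, every sample in the batch yields a well-defined triplet. In a real training setup the centroids would be computed from learned network embeddings and the loss would be differentiated with an autograd framework; that part is omitted here.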
Total 355
122 International Conference Miseul Kim, Minh-Tri Ho, Hong-Goo Kang "Self-supervised Complex Network for Machine Sound Anomaly Detection" in EUSIPCO, 2021
121 International Conference Kihyuk Jeong, Huu-Kim Nguyen, Hong-Goo Kang "A Fast and Lightweight Text-To-Speech Model with Spectrum and Waveform Alignment Algorithms" in EUSIPCO, 2021
120 International Conference Jiyoung Lee*, Soo-Whan Chung*, Sunok Kim, Hong-Goo Kang**, Kwanghoon Sohn** "Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation" in CVPR, 2021
119 International Conference Zainab Alhakeem, Hong-Goo Kang "Confidence Learning from Noisy Labels for Arabic Dialect Identification" in ITC-CSCC, 2021
118 International Conference Huu-Kim Nguyen, Kihyuk Jeong, Hong-Goo Kang "Fast and Lightweight Speech Synthesis Model based on FastSpeech2" in ITC-CSCC, 2021
117 International Conference Yoohwan Kwon*, Hee-Soo Heo*, Bong-Jin Lee, Joon Son Chung "The ins and outs of speaker recognition: lessons from VoxSRC 2020" in ICASSP, 2021
116 International Conference You Jin Kim, Hee Soo Heo, Soo-Whan Chung, Bong-Jin Lee "End-to-end Lip Synchronisation Based on Pattern Classification" in IEEE Spoken Language Technology Workshop (SLT), 2020
115 International Conference Seong Min Kye, Yoohwan Kwon, Joon Son Chung "Cross Attentive Pooling for Speaker Verification" in IEEE Spoken Language Technology Workshop (SLT), 2020
114 International Conference Suhyeon Oh, Hyungseob Lim, Kyungguen Byun, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang "ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis" in APSIPA (*awarded Best Paper), 2020
113 International Conference Hyeon-Kyeong Shin, Hyewon Han, Kyungguen Byun, Hong-Goo Kang "Speaker-invariant Psychological Stress Detection Using Attention-based Network" in APSIPA, 2020