Combining Five Acoustic Level Modeling Methods for Automatic Speaker Age and Gender Recognition

International Conference
2010-09-26 23:51
Authors : Ming Li, Chi-Sang Jung, Kyu J. Han

Year : 2010

Publisher / Conference : INTERSPEECH

Page : 2826-2829

This paper presents a novel automatic speaker age and gender identification approach that combines five methods at the acoustic level to improve on baseline performance. The five subsystems are (1) a Gaussian mixture model (GMM) system based on mel-frequency cepstral coefficient (MFCC) features, (2) a support vector machine (SVM) based on GMM mean supervectors, (3) an SVM based on GMM maximum likelihood linear regression (MLLR) matrix supervectors, (4) an SVM based on GMM 'Tandem' supervectors, and (5) the SVM baseline system based on 450-dimensional utterance-level feature vectors, including prosodic features, provided by the challenge organizing committee. The five subsystems are fused at the score level to improve overall classification performance. On the 2010 Interspeech Paralinguistic Challenge aGender database, the proposed fusion system achieves 52.7% unweighted accuracy on the joint age-gender classification task, an absolute improvement of 9.6% over the GMM-MFCC system and 8.2% over the SVM baseline.
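The abstract relies on two ideas that a short sketch can make concrete: score-level fusion and unweighted accuracy. The Python below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say which fusion rule or weights were used, so a plain weighted sum of per-class scores is assumed here, with subsystem scores assumed to be already normalized to a comparable scale. The function names `fuse_scores`, `predict`, and `unweighted_accuracy` are hypothetical. Unweighted accuracy is computed as the mean of per-class recalls, so that frequent classes do not dominate the figure.

```python
from collections import defaultdict


def fuse_scores(subsystem_scores, weights):
    """Fuse per-class scores for one utterance with a weighted sum.

    subsystem_scores: one list of per-class scores per subsystem.
    weights: one weight per subsystem (assumed; not given in the abstract).
    """
    n_classes = len(subsystem_scores[0])
    fused = [0.0] * n_classes
    for scores, weight in zip(subsystem_scores, weights):
        for cls, score in enumerate(scores):
            fused[cls] += weight * score
    return fused


def predict(fused):
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda cls: fused[cls])


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return sum(correct[cls] / total[cls] for cls in total) / len(total)


# Toy example: two subsystems, two classes, equal weights.
fused = fuse_scores([[0.2, 0.8], [0.6, 0.4]], [0.5, 0.5])
label = predict(fused)

# A classifier that always answers class 0 on an imbalanced set
# gets 75% plain accuracy but only 50% unweighted accuracy.
ua = unweighted_accuracy([0, 0, 0, 1], [0, 0, 0, 0])
```

In practice the fusion weights would be tuned on held-out development data; the equal weights above are placeholders for illustration.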