Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition

Domestic Journal
2010-08-01 01:11
Authors : Myung-Suk Song, Chang-Heon Lee, Seok-Pil Lee, Hong-Goo Kang

Year : 2010

Publisher / Conference : The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume : 29, No. 2

Page : 86-99

This paper analyzes the performance of various single-channel speech enhancement algorithms when they are applied as a preprocessor to automatic speech recognition (ASR) systems. The speech enhancement system is first divided into four major functional modules: a gain estimator, a noise power spectrum estimator, an a priori signal-to-noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We then investigate the relationship between speech recognition accuracy and the role of each module. Simulation results show that, when a perfect noise estimator is applied, the Wiener filter outperforms other gain functions such as the minimum mean square error short-time spectral amplitude (MMSE-STSA) and minimum mean square error log-spectral amplitude (MMSE-LSA) estimators. When the performance of the noise estimator degrades, however, MMSE methods that include the decision-directed module for estimating the a priori SNR and the SAP estimation module help to improve the performance of the enhancement algorithm for speech recognition systems.
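As a rough illustration of two of the modules named in the abstract, the sketch below combines a Wiener gain with the standard decision-directed a priori SNR estimate for a single frequency bin. It is not the paper's implementation: the function names, the smoothing factor `alpha = 0.98`, the floor `xi_min`, and the assumption that the noise power is known exactly (the "perfect noise estimator" case) are all illustrative choices.

```python
def wiener_gain(xi):
    """Wiener gain for a linear-scale a priori SNR xi."""
    return xi / (1.0 + xi)


def decision_directed_snr(noisy_power, noise_power, prev_clean_power,
                          alpha=0.98, xi_min=1e-3):
    """Decision-directed a priori SNR estimate for one frame of one bin.

    alpha and xi_min are assumed values, not taken from the paper.
    """
    gamma = noisy_power / noise_power  # a posteriori SNR
    xi = (alpha * prev_clean_power / noise_power
          + (1.0 - alpha) * max(gamma - 1.0, 0.0))
    return max(xi, xi_min)  # floor the estimate so the gain never reaches 0


def enhance_bin(noisy_powers, noise_power, alpha=0.98):
    """Return the per-frame Wiener gains for one frequency bin.

    noisy_powers: noisy power spectrum values |Y|^2, one per frame.
    noise_power:  noise power for this bin, assumed known and constant.
    """
    prev_clean_power = 0.0  # no clean-speech estimate before the first frame
    gains = []
    for power in noisy_powers:
        xi = decision_directed_snr(power, noise_power, prev_clean_power, alpha)
        gain = wiener_gain(xi)
        gains.append(gain)
        # Power of the enhanced amplitude, fed back into the next frame.
        prev_clean_power = (gain ** 2) * power
    return gains


if __name__ == "__main__":
    # Two noise-only frames followed by two frames with strong speech.
    print(enhance_bin([1.0, 1.0, 100.0, 100.0], noise_power=1.0))
```

In this toy run the gain stays near zero for the noise-only frames and rises over the two speech frames, which reflects the one-frame lag that the decision-directed recursion introduces. Replacing `wiener_gain` with an MMSE-STSA or MMSE-LSA gain would leave the rest of the loop unchanged.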