Mean normalization of power function based cepstral coefficients for robust speech recognition in noisy environment

International Conference
2014-05-01 00:42
Authors : Soonho Baek, Hong-Goo Kang

Year : 2014

Publisher / Conference : ICASSP

This paper presents the effect of mean normalization on various types of cepstral coefficients for robust speech recognition in noisy environments. Although the cepstral mean normalization (CMN) technique was originally designed to compensate for channel distortion, it has also been shown to improve recognition accuracy in additive noise environments. However, the interaction of CMN with the spectral mapping functions required for extracting cepstral features has not yet been considered. This paper investigates how the impact of CMN on a speech recognition system depends on the type of spectral mapping function, by mathematically analyzing the amount of spectral distortion between clean and noisy conditions. The analytic result is also confirmed by comparing the types of recognition error patterns in automatic speech recognition experiments on the Aurora 2 database. Experimental results show that the performance improvement from adopting CMN becomes significant when the logarithmic function is replaced with an appropriately configured fractional power mapping function. In particular, deletion errors are dramatically reduced.
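To make the terms in the abstract concrete, the following is a minimal sketch of cepstral extraction with either a logarithmic or a fractional power mapping, followed by per-utterance CMN. It is not the authors' implementation. The function names (`dct_matrix`, `cepstra`, `cmn`), the number of cepstral coefficients, and the exponent 1/15 are illustrative assumptions; the abstract does not state which exponent the paper uses.

```python
import numpy as np

def dct_matrix(n_ceps, n_bands):
    # Unnormalized DCT-II basis: row k is the k-th cosine over the bands.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bands)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_bands)

def cepstra(band_energies, mapping="log", power=1.0 / 15.0, n_ceps=13):
    """Map filterbank energies (frames x bands) to cepstral coefficients.

    mapping="log" uses the conventional logarithm; mapping="power" uses
    a fractional power. The default exponent is a placeholder assumption.
    """
    e = np.maximum(band_energies, 1e-10)  # floor to avoid log(0)
    if mapping == "log":
        mapped = np.log(e)
    elif mapping == "power":
        mapped = e ** power
    else:
        raise ValueError("mapping must be 'log' or 'power'")
    return mapped @ dct_matrix(n_ceps, e.shape[1]).T

def cmn(c):
    # Cepstral mean normalization: subtract the per-utterance mean
    # of each coefficient across frames.
    return c - c.mean(axis=0, keepdims=True)

# Synthetic stand-in for filterbank energies (50 frames, 23 bands).
rng = np.random.default_rng(0)
energies = rng.uniform(0.1, 10.0, size=(50, 23))
log_cmn = cmn(cepstra(energies, "log"))
pow_cmn = cmn(cepstra(energies, "power"))
```

With the logarithmic mapping, a fixed per-band channel gain becomes an additive constant in the cepstral domain, so mean subtraction removes it exactly. With a power mapping the same gain acts multiplicatively on the mapped values and is not removed exactly, which is one reason the interaction between CMN and the mapping function deserves separate analysis.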