Papers

Simultaneous recognition of words and prosody in the Boston University Radio Speech Corpus

International Journal
~2005
Posted by
이진영
Posted on
2005-07-01 13:52
Authors : Mark Hasegawa-Johnson, Ken Chen, Jennifer Cole, Sarah Borys, Sung-Suk Kim, Aaron Cohen, Tong Zhang, Jeung-Yoon Choi, Heejin Kim, Taejin Yoon, Sandra Chavarria

Year : 2005

Publisher / Conference : Speech Communication

Volume : 46, issue 3-4

Pages : 418-439

This paper describes automatic speech recognition systems that satisfy two technological objectives. First, we seek to improve the automatic labeling of prosody, in order to aid future research in automatic speech understanding. Second, we seek to apply statistical speech recognition models of prosody for the purpose of reducing the word error rate of an automatic speech recognizer. The systems described in this paper are variants of a core dynamic Bayesian network model, in which the key hidden variables are the word, the prosodic tag sequence, and the prosody-dependent allophones. Statistical models of the interaction among words and prosodic tags are trained using the Boston University Radio Speech Corpus, a database annotated using the tones and break indices (ToBI) prosodic annotation system. This paper presents both theoretical and empirical results in support of the conclusion that a prosody-dependent speech recognizer—a recognizer that simultaneously computes the most-probable word labels and prosodic tags—can provide lower word recognition error rates than a standard prosody-independent speech recognizer in a multi-speaker speaker-dependent speech recognition task on radio speech.
72 International Conference Sun-kuk Moon, Tack-sung Choi, Young-Cheol Park, Dae Hee Youn "An Efficient Feature Selection Algorithm Based on Kullback-Leibler Divergence for Music Information Retrieval" in ITC-CSCC, pp.524-525, 2007
71 International Conference Min-Ki Lee, Sung-Wan Youn, Kyung-Tae Kim, Hong-Goo Kang "Speech Quality Degradation in Packet Loss Environment at Specific Speech Class" in ITC-CSCC, pp.781-782, 2007
70 International Conference Chi-Sang Jung, Bong-Jin Lee, Jeung-Yoon Choi, Hong-Goo Kang "An Adaptive Selection of Frame Shift for Speaker Recognition Systems" in The Second Beijing-Hong Kong International Doctoral Forum, 2007
69 International Conference Min-Seok Choi, Young-Cheol Park, Hong-Goo Kang "A Soft-Decision Adaptation Mode Controller for an Efficient Frequency-Domain Generalized Sidelobe Canceller" in ICASSP, 2007
68 International Conference Yoomi Hur, Young-Cheol Park, Dae Hee Youn "A New Structure for Stereo Acoustic Echo Cancellation Based on Binaural Cue Coding" in 122nd Convention of the Audio Engineering Society, pp.7094, 2007
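67 Domestic Conference 정치상, 이봉진, 최정윤, 강홍구, 윤대희 "A Study on Methods for Selecting Various Frame Shift Lengths for Speaker Recognition Systems" in 2007 Summer Conference (하계종합학술발표회), 2007
66 Domestic Conference 문선국, 최택성, 박영철, 윤대희 "Performance Comparison of Feature Vectors for Music Genre Classification According to Filter Banks of Various Resolutions" in 2007 Summer Conference (하계종합학술발표회), 2007
65 Domestic Conference 최택성, 문선국, 박영철 "Music Genre Classification Using Relative Specific Loudness Histogram Feature Vectors" in 2007 Summer Conference (하계종합학술발표회), 2007
64 Domestic Conference 김정근, 박영철, 윤대희 "An HE-AAC Reconstruction Method at Low Bit Rates Using a Sinusoidal Model" in 2007 Summer Conference (하계종합학술발표회), 2007
63 Domestic Conference 이동금, 박영철, 윤대희 "A Study on Improving the Quality of Audio Decoded from Low-Bit-Rate MPEG AAC" in 2007 Summer Conference (하계종합학술발표회), 2007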