Papers

A Cross-channel Attention-based Wave-U-Net for Multi-channel Speech Enhancement

International Conference
2016~2020
작성자
한혜원
작성일
2020-10-01 16:50
조회
382
Authors : Minh-Tri Ho, Jinyoung Lee, Bong-Ki Lee, Dong Hoon Yi, Hong-Goo Kang

Year : 2020

Publisher / Conference : INTERSPEECH

Related project : 극심한 잡음 환경에서의 음성 인식 성능향상을 위한 다채널 음질 개선 알고리즘 개발

Presentation : 구두

In this paper, we present a novel architecture for multi-channel speech enhancement using a cross-channel attention-based Wave-U-Net structure. Despite the advantages of utilizing spatial information as well as spectral information, it is challenging to effectively train a multi-channel deep learning system in an end-to-end framework.
With a channel-independent encoding architecture for spectral estimation and a strategy to extract spatial information through an inter-channel attention mechanism, we implement a multi-channel speech enhancement system that has high performance even in reverberant and extremely noisy environments.
Experimental results show that the proposed architecture has superior performance in terms of signal-to-distortion ratio improvement (SDRi), short-time objective intelligence (STOI), and phoneme error rate (PER) for speech recognition.
전체 326
306 International Conference Min-Jae Hwang, Frank Soong, Eunwoo Song, Xi Wang, Hyeonjoo Kang, Hong-Goo Kang "LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis" in APSIPA, 2020
305 International Conference Hyungseob Lim, Suhyeon Oh, Kyungguen Byun, Hong-Goo Kang "A Study on Conditional Features for a Flow-based Neural Vocoder" in Asilomar Conference on Signals, Systems, and Computers, 2020
304 International Conference Soo-Whan Chung, Soyeon Choe, Joon Son Chung, Hong-Goo Kang "FaceFilter: Audio-visual speech separation using still images" in INTERSPEECH (*awarded Best Student Paper), 2020
303 International Conference Soo-Whan Chung, Hong-Goo Kang, Joon Son Chung "Seeing Voices and Hearing Voices: Learning Discriminative Embeddings Using Cross-Modal Self-Supervision" in INTERSPEECH, 2020
302 International Conference Hyewon Han, Soo-Whan Chung, Hong-Goo Kang "MIRNet: Learning multiple identities representations in overlapped speech" in INTERSPEECH, 2020
301 International Conference Yoohwan Kwon, Soo-Whan Chung, Hong-Goo Kang "Intra-Class Variation Reduction of Speaker Representation in Disentanglement Framework" in INTERSPEECH, 2020
300 International Conference Minh-Tri Ho, Jinyoung Lee, Bong-Ki Lee, Dong Hoon Yi, Hong-Goo Kang "A Cross-channel Attention-based Wave-U-Net for Multi-channel Speech Enhancement" in INTERSPEECH, 2020
299 International Journal Young-Sun Joo, Hanbin Bae, Young-Ik Kim, Hoon-Young Cho, Hong-Goo Kang "Effective Emotion Transplantation in an End-to-End Text-to-Speech System" in IEEE Access, vol.8, pp.161713-161719, 2020
298 Domestic Journal 권유환, 정수환, 강홍구 "화자 인식을 위한 적대학습 기반음성 분리 프레임워크에 대한 연구" in 한국음향학회지, vol.39, 제 5호, pp.447-453, 2020
297 Domestic Conference 오태양, 정기혁, 강홍구 "화자 및 발화 스타일 임베딩을 통한 다화자 음성합성 시스템 음질 향상" in 전자공학회 하계학술대회, pp.980-982, 2020