A Cross-channel Attention-based Wave-U-Net for Multi-channel Speech Enhancement
In this paper, we present a novel architecture for multi-channel speech enhancement using a cross-channel attention-based Wave-U-Net structure. Despite the advantages of utilizing spatial information as well as spectral information, it is challenging to effectively train a multi-channel deep learning system in an end-to-end framework.
With a channel-independent encoding architecture for spectral estimation and a strategy to extract spatial information through an inter-channel attention mechanism, we implement a multi-channel speech enhancement system that performs well even in reverberant and extremely noisy environments.
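To make the inter-channel attention idea concrete, the following is a minimal sketch, not the architecture from the paper. It assumes each microphone channel has already been encoded independently into a feature vector for one time step, and it mixes those vectors with plain scaled dot-product attention. The function name `cross_channel_attention`, the absence of learned query/key/value projections, and the scaling by the square root of the feature dimension are all illustrative assumptions; the abstract does not specify the form of attention used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_channel_attention(features):
    """Mix per-channel features with attention across channels.

    features: list of C vectors (one per microphone channel), each a list
    of D floats taken from a channel-independent encoder at one time step.
    Hypothetical simplification: queries, keys, and values are the raw
    features themselves (no learned projections).

    Returns (outputs, weights): C mixed vectors of length D, and the
    C x C attention weights, where each row sums to 1.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in features:
        # How strongly this channel attends to every channel (itself included).
        row = softmax([dot(query, key) / scale for key in features])
        # Weighted sum of all channel features: the spatially informed output.
        mixed = [sum(w * value[d] for w, value in zip(row, features))
                 for d in range(dim)]
        outputs.append(mixed)
        weights.append(row)
    return outputs, weights


if __name__ == "__main__":
    two_mics = [[1.0, 0.0, 2.0], [0.5, 1.0, -1.0]]
    mixed, attn = cross_channel_attention(two_mics)
    print(attn)
```

In a trained system the queries, keys, and values would come from learned layers and the operation would be applied at every time step and encoder depth where channels exchange information; the sketch only shows how one channel's representation can be re-weighted by its similarity to the others.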
Experimental results show that the proposed architecture has superior performance in terms of signal-to-distortion ratio improvement (SDRi), short-time objective intelligibility (STOI), and phoneme error rate (PER) for speech recognition.
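As an illustration of the SDRi metric, the sketch below computes a simple energy-ratio SDR in decibels and reports the gain of the enhanced signal over the unprocessed noisy input. This is an assumption about the metric's form: the abstract does not state which SDR variant was used (for example, a BSS Eval or scale-invariant definition), and the function names and the small `eps` stabilizer are hypothetical.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB: clean energy over error energy.

    Simple energy-ratio definition, assumed here for illustration only.
    eps guards against division by zero and log of zero.
    """
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal + eps) / (error + eps))


def sdr_improvement_db(reference, noisy, enhanced):
    """SDRi: SDR of the enhanced output minus SDR of the noisy input."""
    return sdr_db(reference, enhanced) - sdr_db(reference, noisy)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    noisy = [s + 0.5 for s in clean]      # large residual error
    enhanced = [s + 0.05 for s in clean]  # error amplitude reduced tenfold
    print(round(sdr_improvement_db(clean, noisy, enhanced), 3))  # 20.0
```

Reducing the error amplitude by a factor of ten raises the SDR by 20 dB, which is why the toy example prints an improvement of 20 dB.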
|327||International Journal||Jinyoung Lee, Hong-Goo Kang "Two-Stage Refinement of Magnitude and Complex Spectra for Real-Time Speech Enhancement" in IEEE Signal Processing Letters, vol.29, pp.2188-2192, 2022|
|326||International Conference||Hyeon-Kyeong Shin, Hyewon Han, Doyeon Kim, Soo-Whan Chung, Hong-Goo Kang "Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting" in INTERSPEECH (*Best Student Paper Finalist), 2022|
|325||International Conference||Changhwan Kim, Se-yun Um, Hyungchan Yoon, Hong-Goo Kang "FluentTTS: Text-dependent Fine-grained Style Control for Multi-style TTS" in INTERSPEECH, 2022|
|324||International Conference||Miseul Kim, Zhenyu Piao, Seyun Um, Ran Lee, Jaemin Joh, Seungshin Lee, Hong-Goo Kang "Light-Weight Speaker Verification with Global Context Information" in INTERSPEECH, 2022|
|323||International Journal||Kyungguen Byun, Se-yun Um, Hong-Goo Kang "Length-Normalized Representation Learning for Speech Signals" in IEEE Access, vol.10, pp.60362-60372, 2022|
|322||International Conference||Doyeon Kim, Hyewon Han, Hyeon-Kyeong Shin, Soo-Whan Chung, Hong-Goo Kang "Phase Continuity: Learning Derivatives of Phase Spectrum for Speech Enhancement" in ICASSP, 2022|
|321||International Conference||Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang "Progressive Multi-Stage Neural Audio Coding with Guided References" in ICASSP, 2022|
|320||International Conference||Jihyun Lee, Hyungseob Lim, Chanwoo Lee, Inseon Jang, Hong-Goo Kang "Adversarial Audio Synthesis Using a Harmonic-Percussive Discriminator" in ICASSP, 2022|
|319||International Conference||Jinyoung Lee, Hong-Goo Kang "Stacked U-Net with High-level Feature Transfer for Parameter Efficient Speech Enhancement" in APSIPA ASC, 2021|
|318||International Conference||Huu-Kim Nguyen, Kihyuk Jeong, Se-Yun Um, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang "LiteTTS: A Decoder-free Light-weight Text-to-wave Synthesis Based on Generative Adversarial Networks" in INTERSPEECH, 2021|