Papers

Two-Stage Refinement of Magnitude and Complex Spectra for Real-Time Speech Enhancement

International Journal
2021~
Posted by : dsp
Posted on : 2022-11-09 14:37
Authors : Jinyoung Lee, Hong-Goo Kang

Year : 2022

Publisher / Conference : IEEE Signal Processing Letters

Volume : 29

Page : 2188-2192

Research area : Speech Signal Processing, Speech Enhancement

Presentation/Publication date : 17 October 2022

Presentation : None

In this letter, we propose a two-stage network for performing speech enhancement that predicts magnitude spectra in the first stage and complex spectra in the second stage. To maximize the model’s performance at each stage, we propose two convolutional modules: magnitude spectral masking (MSM) and complex spectra refinement (CSR). Each module is designed to take into account the specific characteristics of the signal type it handles. The MSM estimates multiplicative masks to remove noise in the magnitude component of the convolutional features, and the CSR refines the complex component of the convolutional features using additive features. By using these modules, our proposed two-stage enhancement model shows higher performance than previously proposed state-of-the-art algorithms. In addition, the number of parameters of our model is only 2.63 million, and it can operate in real time thanks to its causal characteristics and low computational complexity.
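The contrast between the two stages, multiplicative masking of magnitudes followed by additive refinement of complex values, can be illustrated with a minimal sketch. This is not the authors' implementation: in the letter both modules act on learned convolutional features inside a network, whereas the sketch applies the two operations directly to a handful of spectral bins. The function names `apply_magnitude_mask` and `apply_complex_residual` and all numeric values are hypothetical, chosen only for illustration.

```python
import cmath

def apply_magnitude_mask(noisy_spec, mask):
    """Stage 1 (illustrative): scale each bin's magnitude by a real mask
    value, leaving the noisy phase untouched."""
    out = []
    for x, m in zip(noisy_spec, mask):
        mag, phase = cmath.polar(x)            # split into magnitude and phase
        out.append(cmath.rect(m * mag, phase)) # rebuild with the scaled magnitude
    return out

def apply_complex_residual(stage1_spec, residual):
    """Stage 2 (illustrative): add a complex correction to each bin, which
    can adjust both magnitude and phase."""
    return [s + r for s, r in zip(stage1_spec, residual)]

# Hypothetical inputs: three complex spectral bins, a real-valued mask in
# [0, 1], and a complex residual. In a real system a network would predict
# the mask and the residual; here they are fixed by hand.
noisy = [3 + 4j, 1 - 1j, -2 + 0j]
mask = [0.5, 0.0, 1.0]
residual = [0.1 - 0.2j, 0j, 0.5j]

stage1 = apply_magnitude_mask(noisy, mask)
stage2 = apply_complex_residual(stage1, residual)
```

A multiplicative mask can only rescale what is already present, so the first stage cannot change the phase. The additive second stage is not restricted in that way, which is one plausible reading of why the abstract pairs the two operations.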
Total : 372
150 International Conference Hyungchan Yoon, Changhwan Kim, Eunwoo Song, Hyun-Wook Yoon, Hong-Goo Kang "Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech" in INTERSPEECH, 2023
149 International Conference Doyeon Kim, Soo-Whan Chung, Hyewon Han, Youna Ji, Hong-Goo Kang "HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders" in INTERSPEECH, 2023
148 Domestic Conference Jihyun Lee, Wootaek Lim, Hong-Goo Kang "음성 압축에서의 심층 신경망 기반 장구간 예측" in 한국방송·미디어공학회 2023년 하계학술대회, 2023
147 Domestic Conference Hwayeon Kim, Hong-Goo Kang "Band-Split based Dual-Path Convolution Recurrent Network for Music Source Separation" in 2023년도 한국음향학회 춘계학술발표대회 및 제38회 수중음향학 학술발표회, 2023
146 International Conference Zhenyu Piao, Miseul Kim, Hyungchan Yoon, Hong-Goo Kang "HappyQuokka System for ICASSP 2023 Auditory EEG Challenge" in ICASSP, 2023
145 International Conference Byeong Hyeon Kim, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang "Progressive Multi-Stage Neural Audio Codec with Psychoacoustic Loss and Discriminator" in ICASSP, 2023
144 International Conference Hyungseob Lim, Jihyun Lee, Byeong Hyeon Kim, Inseon Jang, Hong-Goo Kang "End-to-End Neural Audio Coding in the MDCT Domain" in ICASSP, 2023
143 International Conference Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang "Style Modeling for Multi-Speaker Articulation-to-Speech" in ICASSP, 2023
142 International Journal Jinyoung Lee, Hong-Goo Kang "Real-Time Neural Speech Enhancement Based on Temporal Refinement Network and Channel-Wise Gating Methods" in Digital Signal Processing, vol.133, 2023
141 International Journal Taemin Kim, Yejee Shin, Kyowon Kang, Kiho Kim, Gwanho Kim, Yunsu Byeon, Hwayeon Kim, Yuyan Gao, Jeong Ryong Lee, Geonhui Son, Taeseong Kim, Yohan Jun, Jihyun Kim, Jinyoung Lee, Seyun Um, Yoohwan Kwon, Byung Gwan Son, Myeongki Cho, Mingyu Sang, Jongwoon Shin, Kyubeen Kim, Jungmin Suh, Heekyeong Choi, Seokjun Hong, Huanyu Cheng, Hong-Goo Kang, Dosik Hwang & Ki Jun Yu "Ultrathin crystalline-silicon-based strain gauges with deep learning algorithms for silent speech interfaces" in Nature Communications, vol.13, 2022