Papers

Stacked U-Net with High-level Feature Transfer for Parameter Efficient Speech Enhancement

International Conference
2021~
작성자
김화연
작성일
2021-09-01 14:48
조회
1793
Authors : Jinyoung Lee and Hong-Goo Kang

Year : 2021

Publisher / Conference : APSIPA ASC

Research area : Speech Signal Processing, Speech Enhancement

In this paper, we present a stacked U-Net structure-based speech enhancement algorithm with parameter reduction and real-time processing. To significantly reduce the number of network parameters, we propose a stacked structure in which several shallow U-Nets with fewer convolutional layer channels are cascaded. However, simply stacking the small-scale U-Nets cannot sufficiently compensate for the performance loss caused by the lack of parameters. To overcome this problem, we propose a high-level feature transfer method that passes all the multi-channel output features, which are obtained before passing through the intermediate output layer, to the next stage.Furthermore, our proposed model can process analysis frames with short lengths because its down-sampling and up-sampling blocks are much smaller than the conventional Wave U-Net method; theses smaller layers make our proposed model suitable for low-delay online processing. Experiments show that our proposed method outperforms the conventional Wave U-Net method on almost all objective measures and requires only 7.21%of the network parameters when compared to the conventional method. In addition, our model can be successfully implemented in real time on both GPU and CPU environments.
전체 355
2 International Conference Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang "Progressive Multi-Stage Neural Audio Coding with Guided References" in ICASSP, 2022
1 International Conference Jihyun Lee, Hyungseob Lim, Chanwoo Lee, Inseon Jang, Hong-Goo Kang "Adversarial Audio Synthesis Using a Harmonic-Percussive Discriminator" in ICASSP, 2022