Papers

A Study on Conditional Features for a Flow-based Neural Vocoder

International Conference
2016~2020
Posted by
한혜원
Date posted
2020-11-01 17:03
Authors : Hyungseob Lim, Suhyeon Oh, Kyungguen Byun, Hong-Goo Kang

Year : 2020

Publisher / Conference : Asilomar Conference on Signals, Systems, and Computers

In this paper, we propose an effective way of providing conditional features for a flow-based neural vocoder. Most conventional approaches condition neural vocoders on mel-spectrograms, but the high dimensionality of mel-spectrograms significantly increases the size of the neural network. We show that the network size of a flow-based generative model can be reduced by conditioning it instead on the acoustic parameters of a sinusoidal speech analysis-and-synthesis framework, such as the voiced/unvoiced flag, fundamental frequency, mel-cepstral coefficients, and energy of each analysis frame. We also find that training becomes much easier if the fundamental frequency is quantized with a small number of bits and then fed to the network as an embedded vector representation. Experimental results verify that the performance of the proposed algorithm is comparable to that of flow-based neural vocoders conditioned on mel-spectrograms, while both the amount of information required for the feature representation and the network complexity for generating speech are lower.
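The conditioning scheme described in the abstract (a quantized fundamental frequency passed through an embedding, combined with the voiced/unvoiced flag, mel-cepstral coefficients, and frame energy) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 6-bit code size, the 50 to 500 Hz range, the log-uniform quantization grid, the embedding dimension, the random (untrained) embedding table, and the hypothetical names `quantize_f0`, `make_embedding_table`, and `build_conditioning`.

```python
import numpy as np

# Illustrative assumptions only; none of these values come from the paper.
F0_MIN_HZ = 50.0
F0_MAX_HZ = 500.0
NUM_BITS = 6                          # "small number of bits" (assumed value)
EMBED_DIM = 8                         # embedding size (assumed value)
NUM_VOICED_BINS = 2 ** NUM_BITS - 1   # code 0 is reserved for unvoiced frames


def quantize_f0(f0_hz):
    """Map F0 values in Hz to integer codes: 0 = unvoiced, 1..63 = voiced."""
    f0_hz = np.asarray(f0_hz, dtype=np.float64)
    voiced = f0_hz > 0.0
    clipped = np.clip(f0_hz, F0_MIN_HZ, F0_MAX_HZ)
    # Position in [0, 1] on a log-frequency axis (assumed quantization grid).
    pos = (np.log(clipped) - np.log(F0_MIN_HZ)) / (
        np.log(F0_MAX_HZ) - np.log(F0_MIN_HZ)
    )
    bins = np.minimum((pos * NUM_VOICED_BINS).astype(np.int64), NUM_VOICED_BINS - 1)
    return np.where(voiced, bins + 1, 0)


def make_embedding_table(seed=0):
    """Random stand-in for an embedding table that would be learned in training."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((2 ** NUM_BITS, EMBED_DIM))


def build_conditioning(f0_hz, mcep, energy, table):
    """Concatenate per-frame features: [V/UV flag, F0 embedding, mel-cepstrum, energy]."""
    codes = quantize_f0(f0_hz)
    vuv = (codes > 0).astype(np.float64)[:, None]   # shape (T, 1)
    f0_embedding = table[codes]                     # shape (T, EMBED_DIM)
    energy = np.asarray(energy, dtype=np.float64)[:, None]
    return np.concatenate([vuv, f0_embedding, mcep, energy], axis=1)


if __name__ == "__main__":
    f0 = np.array([0.0, 50.0, 120.0, 500.0])   # 0 Hz marks an unvoiced frame
    mcep = np.zeros((4, 25))                   # 25 coefficients is an assumed order
    energy = np.ones(4)
    cond = build_conditioning(f0, mcep, energy, make_embedding_table())
    print(quantize_f0(f0), cond.shape)
```

In a real vocoder the embedding table would be a trainable layer and the resulting feature matrix would be upsampled to the waveform rate before conditioning the flow; the sketch only shows how the per-frame feature vector might be assembled.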
8 International Conference Zhenyu Piao, Hyungseob Lim, Miseul Kim, Hong-Goo Kang "PDF-NET: Pitch-adaptive Dynamic Filter Network for Intra-gender Speaker Verification" in APSIPA ASC, 2023
7 International Conference Byeong Hyeon Kim, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang "Progressive Multi-Stage Neural Audio Codec with Psychoacoustic Loss and Discriminator" in ICASSP, 2023
6 International Conference Hyungseob Lim, Jihyun Lee, Byeong Hyeon Kim, Inseon Jang, Hong-Goo Kang "End-to-End Neural Audio Coding in the MDCT Domain" in ICASSP, 2023
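5 Domestic Conference Hyungseob Lim, Hong-Goo Kang, Inseon Jang "Optimization of a Deep Neural Network-based Audio Compression Model Using an Entropy Model" in Korean Institute of Broadcast and Media Engineers 2022 Summer Conference, 2022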
4 International Conference Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang "Progressive Multi-Stage Neural Audio Coding with Guided References" in ICASSP, 2022
3 International Conference Jihyun Lee, Hyungseob Lim, Chanwoo Lee, Inseon Jang, Hong-Goo Kang "Adversarial Audio Synthesis Using a Harmonic-Percussive Discriminator" in ICASSP, 2022
2 International Conference Suhyeon Oh, Hyungseob Lim, Kyungguen Byun, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang "ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis" in APSIPA (*awarded Best Paper), 2020
1 International Conference Hyungseob Lim, Suhyeon Oh, Kyungguen Byun, Hong-Goo Kang "A Study on Conditional Features for a Flow-based Neural Vocoder" in Asilomar Conference on Signals, Systems, and Computers, 2020