Papers

Progressive Multi-Stage Neural Audio Coding with Guided References

International Conference
2021~
작성자
고우석
작성일
2022-01-24 18:47
조회
4195
Authors : Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang

Year : 2022

Publisher / Conference : ICASSP

Research area : Audio Signal Processing, Coding

Related project : 생성모델 기반 음향압축 기술 연구(2/5)

Presentation : Poster

In this paper, we propose an effective multi-stage neural audio coding algorithm that encodes full-band audio signals (up to 20 kHz) using an end-to-end training criterion. By pre defining several dyadic subband signals as training targets, we progressively encode input audio signals in each stage such that deeper stages of the network encode the residual error terms from the previous encoding stage. Our proposed audio codec successfully decodes full-band audio signals by using an effective multi-stage vector quantization scheme to represent key encoding features extracted in the latent space. Subjective listening tests show that the decoded outputs of the proposed audio codec are indistinguishable from the uncoded inputs at an average bitrate of 136 kbps.
전체 372
39 International Conference Byeong Hyeon Kim,Hyungseob Lim,Inseon Jang,Hong-Goo Kang "Towards an Ultra-Low-Delay Neural Audio Coding with Computational Efficiency" in INTERSPEECH, 2025
38 International Conference Stijn Kindt,Jihyun Kim,Hong-Goo Kang,Nilesh Madhu "Efficient, Cluster-Informed, Deep Speech Separation with Cross-Cluster Information in AD-HOC Wireless Acoustic Sensor Networks" in International Workshop on Acoustic Signal Enhancement (IWAENC), 2024
37 International Conference Yeona Hong, Hyewon Han, Woo-jin Chung, Hong-Goo Kang "StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models" in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
36 International Conference Sangmin Lee, Woojin Chung, Hong-Goo Kang "LAMA-UT: Language Agnostic Multilingual ASR through Orthography Unification and Language-Specific Transliteration" in Association for the Advancement of Artificial Intelligence (AAAI), 2025
35 International Conference Juhwan Yoon, Hyungseob Lim, Hyeonjin Cha, Hong-Goo Kang "StylebookTTS: Zero-Shot Text-to-Speech Leveraging Unsupervised Style Representation" in APSIPA ASC, 2024
34 International Conference Doyeon Kim, Yanjue Song, Nilesh Madhu, Hong-Goo Kang "Enhancing Neural Speech Embeddings for Generative Speech Models" in APSIPA ASC, 2024
33 International Conference Miseul Kim, Soo-Whan Chung, Youna Ji, Hong-Goo Kang, Min-Seok Choi "Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation" in INTERSPEECH, 2024
32 International Conference Woo-Jin Chung, Hong-Goo Kang "Speaker-Independent Acoustic-to-Articulatory Inversion through Multi-Channel Attention Discriminator" in INTERSPEECH, 2024
31 International Conference Juhwan Yoon, Woo Seok Ko, Seyun Um, Sungwoong Hwang, Soojoong Hwang, Changhwan Kim, Hong-Goo Kang "UNIQUE : Unsupervised Network for Integrated Speech Quality Evaluation" in INTERSPEECH, 2024
30 International Conference Yanjue Song, Doyeon Kim, Hong-Goo Kang, Nilesh Madhu "Spectrum-aware neural vocoder based on self-supervised learning for speech enhancement" in EUSIPCO, 2024