Scalable Multiband Binaural Renderer for MPEG-H 3D Audio

International Journal
2015-08-01 22:05
Authors : Taegyu Lee, Hyun Oh Oh, Jeongil Seo, Young-Cheol Park, Dae Hee Youn

Year : 2015

Publisher / Conference : IEEE Journal of Selected Topics in Signal Processing

Volume : 9, issue 5

Page : 907-920

To provide immersive 3D multimedia services, MPEG has launched MPEG-H, ISO/IEC 23008, “High Efficiency Coding and Media Delivery in Heterogeneous Environments.” As its audio part, MPEG-H 3D Audio has been standardized based on multichannel loudspeaker configurations (e.g., 22.2). Binaural rendering is a key application of 3D audio; however, previous studies have focused on low-complexity binaural rendering, such as IIR filter design for HRTFs, or on pre-/post-processing to address in-head localization or front-back confusion. In this paper, a new binaural rendering algorithm is proposed to support a large number of input channel signals and to provide high quality in terms of timbre. Parts of this algorithm were adopted into MPEG-H 3D Audio. The proposed algorithm truncates the binaural room impulse response at the mixing time, the transition point from the early reflections to the late reverberation. The two parts are processed independently, by variable order filtering in frequency domain (VOFF) and parametric late reverberation filtering (PLF), respectively. Further, a QMF domain tapped delay line (QTDL) is proposed to reduce complexity in the high-frequency band, based on human auditory perception and codec characteristics. A scalability scheme is adopted to cover a wide range of applications by adjusting the mixing-time threshold. Experimental results show that the proposed algorithm is able to provide the audio quality of a signal binaurally rendered with full-length binaural room impulse responses. A scalability test also shows that the proposed scheme provides a smooth trade-off between audio quality and computational complexity.
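The truncation step described in the abstract can be illustrated with a minimal sketch. It is not the paper's VOFF or PLF implementation: it only splits an impulse response at a chosen mixing time and checks that, by linearity of convolution, filtering with the early and late parts separately and summing reproduces filtering with the full response. The function name `split_brir`, the synthetic decaying-noise response, the 48 kHz sample rate, and the 20 ms mixing time are all assumptions made for illustration.

```python
import numpy as np


def split_brir(brir, mixing_time_s, fs):
    """Split an impulse response at the mixing time (hypothetical helper).

    Returns the early part (truncated to the mixing time) and a late part
    that is zero before the mixing time, so that the zero-padded early part
    plus the late part equals the original response.
    """
    n_mix = int(round(mixing_time_s * fs))
    n_mix = max(0, min(n_mix, len(brir)))
    early = brir[:n_mix].copy()
    late = np.zeros_like(brir)
    late[n_mix:] = brir[n_mix:]
    return early, late


rng = np.random.default_rng(0)
fs = 48000      # assumed sample rate in Hz
n_taps = 4800   # 100 ms synthetic response, a stand-in for a measured BRIR

# Exponentially decaying noise as a crude stand-in for one ear's BRIR.
brir = rng.standard_normal(n_taps) * np.exp(-np.arange(n_taps) / 800.0)
x = rng.standard_normal(1000)  # arbitrary input signal

early, late = split_brir(brir, mixing_time_s=0.02, fs=fs)

y_full = np.convolve(x, brir)
y_early = np.convolve(x, early)
y_late = np.convolve(x, late)

# Sum the two branches; the early branch output is shorter, so add it in place.
y_split = y_late.copy()
y_split[: len(y_early)] += y_early

print(len(early), np.allclose(y_full, y_split))
```

In the proposed renderer the late branch is not convolved exactly as it is here; according to the abstract it is replaced by parametric late reverberation filtering, which is where the complexity saving comes from. Moving the mixing-time threshold earlier shortens the exactly filtered early part, which is the scalability knob the abstract describes.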