Speaker-invariant Psychological Stress Detection Using Attention-based Network

International Conference
2020-12-01 16:58
Authors : Hyeon-Kyeong Shin, Hyewon Han, Kyungguen Byun, Hong-Goo Kang

Year : 2020

Publisher / Conference : APSIPA

Presentation/Publication date : 2020.12.08

Related project : 상대방의 감성을 추론, 판단하여 그에 맞추어 대화하고 대응할 수 있는 감성지능 기술 연구개발 (5/5)

Presentation : Oral

When people get stressed in nervous or unfamiliar situations, their speaking styles or acoustic characteristics change. These changes are particularly emphasized in certain regions of speech, so a model that automatically computes temporal weights for components of the speech signals that reflect stress-related information can effectively capture the psychological state of the speaker. In this paper, we propose an algorithm for psychological stress detection from speech signals using a deep spectral-temporal encoder and multi-head attention with domain adversarial training. To detect long-term variations and spectral relations in the speech under different stress conditions, we build a network by concatenating a convolutional neural network (CNN) and a recurrent neural network (RNN). Then, multi-head attention is utilized to further emphasize stress-concentrated regions. For speaker-invariant stress detection, the network is trained with adversarial multi-task learning by adding a gradient reversal layer. We show the robustness of our proposed algorithm in stress classification tasks on the Multimodal Korean stress database acquired in [1] and the authorized stress database Speech Under Simulated and Actual Stress (SUSAS) [2]. In addition, we demonstrate the effectiveness of multi-head attention and domain adversarial training with visualized analysis using the t-SNE method.
전체 355
325 International Conference Changhwan Kim, Seyun Um, Hyungchan Yoon, Hong-goo Kang "FluentTTS: Text-dependent Fine-grained Style Control for Multi-style TTS" in INTERSPEECH, 2022
324 International Conference Miseul Kim, Zhenyu Piao, Seyun Um, Ran Lee, Jaemin Joh, Seungshin Lee, Hong-Goo Kang "Light-Weight Speaker Verification with Global Context Information" in INTERSPEECH, 2022
323 International Journal Kyungguen Byun, Seyun Um, Hong-Goo Kang "Length-Normalized Representation Learning for Speech Signals" in IEEE Access, vol.10, pp.60362-60372, 2022
322 International Conference Doyeon Kim, Hyewon Han, Hyeon-Kyeong Shin, Soo-Whan Chung, Hong-Goo Kang "Phase Continuity: Learning Derivatives of Phase Spectrum for Speech Enhancement" in ICASSP, 2022
321 International Conference Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang "Progressive Multi-Stage Neural Audio Coding with Guided References" in ICASSP, 2022
320 International Conference Jihyun Lee, Hyungseob Lim, Chanwoo Lee, Inseon Jang, Hong-Goo Kang "Adversarial Audio Synthesis Using a Harmonic-Percussive Discriminator" in ICASSP, 2022
319 International Conference Jinyoung Lee and Hong-Goo Kang "Stacked U-Net with High-level Feature Transfer for Parameter Efficient Speech Enhancement" in APSIPA ASC, 2021
318 International Conference Huu-Kim Nguyen, Kihyuk Jeong, Seyun Um, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang "LiteTTS: A Decoder-free Light-weight Text-to-wave Synthesis Based on Generative Adversarial Networks" in INTERSPEECH, 2021
317 International Conference Zainab Alhakeem, Yoohwan Kwon, Hong-Goo Kang "Disentangled Representations for Arabic Dialect Identification based on Supervised Clustering with Triplet Loss" in EUSIPCO, 2021
316 International Conference Miseul Kim, Minh-Tri Ho, Hong-Goo Kang "Self-supervised Complex Network for Machine Sound Anomaly Detection" in EUSIPCO, 2021