Abstract

This letter examines the perceptual relevance of using the temporal envelope to model the 4–7 kHz frequency band (highband) of wideband speech signals. Drawing on theoretical work in psychoacoustics, we find that the temporal envelope can indeed serve as a perceptual cue for the highband signal, i.e., a noiseless sound can be obtained if the temporal envelope is roughly preserved. Subjective listening tests verify that transparent quality can be obtained when the model is used for the 4.5–7 kHz band. The proposed model offers flexible scalability and reduces the cost of quantization in coding applications.