Arithmetic coding of speech and audio spectra using TCX based on linear predictive spectral envelopes

Bibliographic Details
Main Authors: Backstrom, Tom; Helmrich, Christian R.
Format: Conference Proceeding
Language: English
Description
Summary: Unified speech and audio codecs often use a frequency-domain coding technique of the transform coded excitation (TCX) type. It is based on modeling the speech source with a linear predictor, spectral weighting by a perceptual model, and entropy coding of the frequency components. While previous approaches have used neighbouring frequency components to form a probability model for the entropy coder of spectral components, we propose to use the magnitude of the linear predictive spectral envelope to estimate the variance of spectral components. Since the linear predictor is transmitted in any case, this method does not require any additional side information. Subjective measurements show that the proposed method gives a statistically significant improvement in perceptual quality when the bit rate is held constant. Consequently, the proposed method has been adopted in the 3GPP Enhanced Voice Services speech coding standard.
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2015.7178948
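
Illustration: The idea in the summary, deriving the entropy coder's probability model from the linear predictor instead of from neighbouring spectral components, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the method as published or as standardized: it assumes a zero-mean Laplacian model per frequency bin whose scale is proportional to the LPC envelope magnitude, it omits the perceptual weighting and the arithmetic coder itself, and all function names, the example predictor, and the gain value are hypothetical. It only estimates the ideal code length that an arithmetic coder driven by such a model would approach.

```python
import cmath
import math


def lpc_envelope(a, n_bins):
    """Envelope magnitude 1/|A(e^{jw})| at n_bins frequencies in (0, pi).

    `a` holds the predictor polynomial coefficients [1, a1, a2, ...].
    """
    env = []
    for k in range(n_bins):
        w = math.pi * (k + 0.5) / n_bins
        A = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        env.append(1.0 / abs(A))
    return env


def laplace_bin_prob(q, scale):
    """Probability that a zero-mean Laplacian (scale `scale`) rounds to integer q."""
    def cdf(x):
        if x < 0:
            return 0.5 * math.exp(x / scale)
        return 1.0 - 0.5 * math.exp(-x / scale)
    return cdf(q + 0.5) - cdf(q - 0.5)


def estimated_bits(quantized, scales):
    """Ideal code length in bits for the quantized bins under per-bin Laplacian models."""
    return sum(
        -math.log2(max(laplace_bin_prob(q, b), 1e-12))
        for q, b in zip(quantized, scales)
    )


# Hypothetical first-order predictor with a low-pass envelope (assumption).
a = [1.0, -0.9]
gain = 4.0  # assumed global gain relating the envelope to quantized magnitudes
envelope = lpc_envelope(a, n_bins=8)

# Synthetic quantized spectrum that follows the envelope, with alternating signs.
quantized = [(-1) ** k * round(gain * e) for k, e in enumerate(envelope)]

# Model 1: per-bin scale taken from the LPC envelope (no extra side information,
# since the decoder can recompute the envelope from the transmitted predictor).
env_scales = [gain * e for e in envelope]
env_bits = estimated_bits(quantized, env_scales)

# Model 2: one flat scale for all bins, as a naive baseline.
flat_scale = sum(env_scales) / len(env_scales)
flat_bits = estimated_bits(quantized, [flat_scale] * len(quantized))

print(f"envelope-based model: {env_bits:.2f} bits")
print(f"flat model:           {flat_bits:.2f} bits")
```

In this toy setting the envelope-based model needs fewer bits than the flat one because its per-bin variance estimate tracks the spectral shape. The comparison says nothing about the context-based coders the summary refers to as previous approaches.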