Loading…

Bandwidth Extension of Telephone Speech Using a Neural Network and a Filter Bank Implementation for Highband Mel Spectrum

The limited audio bandwidth used in narrowband telephone systems degrades both the quality and the intelligibility of speech. This paper presents a new method for the bandwidth extension of telephone speech. Frequency components are added to the frequency band 4-8 kHz using only the information in t...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on audio, speech, and language processing speech, and language processing, 2011-09, Vol.19 (7), p.2170-2183
Main Authors: Pulakka, H., Alku, P.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The limited audio bandwidth used in narrowband telephone systems degrades both the quality and the intelligibility of speech. This paper presents a new method for the bandwidth extension of telephone speech. Frequency components are added to the frequency band 4-8 kHz using only the information in the narrowband speech. A neural network is used to estimate the mel spectrum in the extension band in short time frames based on features calculated from the narrowband speech. A wideband excitation signal is generated by spectral folding from the narrowband linear prediction residual and a filter bank is utilized to divide the excitation into four sub-bands that cover the extension band. These sub-bands are weighted such that the estimated mel spectrum is realized. Bandwidth-extended speech is obtained by summing the weighted sub-bands and the original narrowband signal. Listening tests show that this new method improves speech quality compared with narrowband telephone speech and with a previously published bandwidth extension method.
ISSN:1558-7916
2329-9290
1558-7924
2329-9304
DOI:10.1109/TASL.2011.2118206