
Perceptual audio coding using adaptive pre- and post-filters and lossless compression

This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and r...

Bibliographic Details
Published in: IEEE transactions on speech and audio processing 2002-09, Vol.10 (6), p.379-390
Main Authors: Schuller, G.D.T.; Bin Yu; Dawei Huang; Edler, B.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3
cites cdi_FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3
container_end_page 390
container_issue 6
container_start_page 379
container_title IEEE transactions on speech and audio processing
container_volume 10
creator Schuller, G.D.T.
Bin Yu
Dawei Huang
Edler, B.
description This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts traditional audio coding where both are integrated within the same subband decomposition. The separation allows for the independent optimization of the irrelevance and redundancy reduction units. For both reductions, we rely on adaptive filtering and predictive coding as much as possible to minimize the delay. A psycho-acoustically controlled adaptive linear filter is used for the irrelevance reduction, and the redundancy reduction is carried out by a predictive lossless coding scheme, which is termed weighted cascaded least mean squared (WCLMS) method. Experiments are carried out on a database of moderate size which contains mono-signals of different sampling rates and varying nature (music, speech, or mixed). They show that the proposed WCLMS lossless coder outperforms other competing lossless coders in terms of compression ratios and delay, as applied to the pre-filtered signal. Moreover, a subjective listening test of the combined pre-filter/lossless coder and a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay.
doi_str_mv 10.1109/TSA.2002.803444
format article
fullrecord <record><control><sourceid>proquest_pasca</sourceid><recordid>TN_cdi_pascalfrancis_primary_13973497</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1040262</ieee_id><sourcerecordid>28533269</sourcerecordid><originalsourceid>FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3</originalsourceid><addsrcrecordid>eNp9kc1LxDAQxYsoqKtnD16KoHjpbr6apkcRv0BQcD2HNJlIpNvWTCv435t1BcWDl8kw_N6bDC_LjiiZU0rqxfLpYs4IYXNFuBBiK9ujZakKxku-nXoieSFlJXezfcRXQoiildjLnh8hWhjGybS5mVzoc9u70L3kE66rcWYYwzvkQ4QiN53Lhx7Hwod2hIhfg7ZHbAExCVeJQgx9d5DteNMiHH6_s-z5-mp5eVvcP9zcXV7cF1YIORbOON-U3jhgprSuYd7zpkk_JQ4MFWkgGEDlLRBeC1BOVrUztbWNsqJpGj7Lzja-Q-zfJsBRrwJaaFvTQT-hZqrknMk6gef_glRWlAslRJXQkz_oaz_FLp2hlRK8oolK0GID2ZjOj-D1EMPKxA9NiV7HoVMceh2H3sSRFKfftgataX00nQ34I-N1xUW9Xn-84QIA_HIVhEnGPwFr7ZT0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>884371844</pqid></control><display><type>article</type><title>Perceptual audio coding using adaptive pre- and post-filters and lossless compression</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Schuller, G.D.T. ; Bin Yu ; Dawei Huang ; Edler, B.</creator><creatorcontrib>Schuller, G.D.T. ; Bin Yu ; Dawei Huang ; Edler, B.</creatorcontrib><description>This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts traditional audio coding where both are integrated within the same subband decomposition. The separation allows for the independent optimization of the irrelevance and redundancy reduction units. 
For both reductions, we rely on adaptive filtering and predictive coding as much as possible to minimize the delay. A psycho-acoustically controlled adaptive linear filter is used for the irrelevance reduction, and the redundancy reduction is carried out by a predictive lossless coding scheme, which is termed weighted cascaded least mean squared (WCLMS) method. Experiments are carried out on a database of moderate size which contains mono-signals of different sampling rates and varying nature (music, speech, or mixed). They show that the proposed WCLMS lossless coder outperforms other competing lossless coders in terms of compression ratios and delay, as applied to the pre-filtered signal. Moreover, a subjective listening test of the combined pre-filter/lossless coder and a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay.</description><identifier>ISSN: 1063-6676</identifier><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 1558-2353</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TSA.2002.803444</identifier><identifier>CODEN: IESPEJ</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Adaptive filters ; Applied sciences ; Artificial intelligence ; Audio coding ; Coders ; Coding ; Compression ratio ; Computer science; control theory; systems ; Data compression ; Decoding ; Delay ; Encoding ; Exact sciences and technology ; Information, signal and communications theory ; Lossless ; Multiple signal classification ; Predictive coding ; Psychology ; Reduction ; Redundancy ; Sampling methods ; Signal processing ; Speech ; Speech and sound recognition and synthesis. 
Linguistics ; Speech processing ; Telecommunications and information theory</subject><ispartof>IEEE transactions on speech and audio processing, 2002-09, Vol.10 (6), p.379-390</ispartof><rights>2003 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2002</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3</citedby><cites>FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1040262$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=13973497$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Schuller, G.D.T.</creatorcontrib><creatorcontrib>Bin Yu</creatorcontrib><creatorcontrib>Dawei Huang</creatorcontrib><creatorcontrib>Edler, B.</creatorcontrib><title>Perceptual audio coding using adaptive pre- and post-filters and lossless compression</title><title>IEEE transactions on speech and audio processing</title><addtitle>T-SAP</addtitle><description>This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts traditional audio coding where both are integrated within the same subband decomposition. 
The separation allows for the independent optimization of the irrelevance and redundancy reduction units. For both reductions, we rely on adaptive filtering and predictive coding as much as possible to minimize the delay. A psycho-acoustically controlled adaptive linear filter is used for the irrelevance reduction, and the redundancy reduction is carried out by a predictive lossless coding scheme, which is termed weighted cascaded least mean squared (WCLMS) method. Experiments are carried out on a database of moderate size which contains mono-signals of different sampling rates and varying nature (music, speech, or mixed). They show that the proposed WCLMS lossless coder outperforms other competing lossless coders in terms of compression ratios and delay, as applied to the pre-filtered signal. Moreover, a subjective listening test of the combined pre-filter/lossless coder and a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay.</description><subject>Adaptive filters</subject><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Audio coding</subject><subject>Coders</subject><subject>Coding</subject><subject>Compression ratio</subject><subject>Computer science; control theory; systems</subject><subject>Data compression</subject><subject>Decoding</subject><subject>Delay</subject><subject>Encoding</subject><subject>Exact sciences and technology</subject><subject>Information, signal and communications theory</subject><subject>Lossless</subject><subject>Multiple signal classification</subject><subject>Predictive coding</subject><subject>Psychology</subject><subject>Reduction</subject><subject>Redundancy</subject><subject>Sampling methods</subject><subject>Signal processing</subject><subject>Speech</subject><subject>Speech and sound recognition and synthesis. 
Linguistics</subject><subject>Speech processing</subject><subject>Telecommunications and information theory</subject><issn>1063-6676</issn><issn>2329-9290</issn><issn>1558-2353</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2002</creationdate><recordtype>article</recordtype><recordid>eNp9kc1LxDAQxYsoqKtnD16KoHjpbr6apkcRv0BQcD2HNJlIpNvWTCv435t1BcWDl8kw_N6bDC_LjiiZU0rqxfLpYs4IYXNFuBBiK9ujZakKxku-nXoieSFlJXezfcRXQoiildjLnh8hWhjGybS5mVzoc9u70L3kE66rcWYYwzvkQ4QiN53Lhx7Hwod2hIhfg7ZHbAExCVeJQgx9d5DteNMiHH6_s-z5-mp5eVvcP9zcXV7cF1YIORbOON-U3jhgprSuYd7zpkk_JQ4MFWkgGEDlLRBeC1BOVrUztbWNsqJpGj7Lzja-Q-zfJsBRrwJaaFvTQT-hZqrknMk6gef_glRWlAslRJXQkz_oaz_FLp2hlRK8oolK0GID2ZjOj-D1EMPKxA9NiV7HoVMceh2H3sSRFKfftgataX00nQ34I-N1xUW9Xn-84QIA_HIVhEnGPwFr7ZT0</recordid><startdate>20020901</startdate><enddate>20020901</enddate><creator>Schuller, G.D.T.</creator><creator>Bin Yu</creator><creator>Dawei Huang</creator><creator>Edler, B.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7SP</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20020901</creationdate><title>Perceptual audio coding using adaptive pre- and post-filters and lossless compression</title><author>Schuller, G.D.T. 
; Bin Yu ; Dawei Huang ; Edler, B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Adaptive filters</topic><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Audio coding</topic><topic>Coders</topic><topic>Coding</topic><topic>Compression ratio</topic><topic>Computer science; control theory; systems</topic><topic>Data compression</topic><topic>Decoding</topic><topic>Delay</topic><topic>Encoding</topic><topic>Exact sciences and technology</topic><topic>Information, signal and communications theory</topic><topic>Lossless</topic><topic>Multiple signal classification</topic><topic>Predictive coding</topic><topic>Psychology</topic><topic>Reduction</topic><topic>Redundancy</topic><topic>Sampling methods</topic><topic>Signal processing</topic><topic>Speech</topic><topic>Speech and sound recognition and synthesis. 
Linguistics</topic><topic>Speech processing</topic><topic>Telecommunications and information theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Schuller, G.D.T.</creatorcontrib><creatorcontrib>Bin Yu</creatorcontrib><creatorcontrib>Dawei Huang</creatorcontrib><creatorcontrib>Edler, B.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on speech and audio processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Schuller, G.D.T.</au><au>Bin Yu</au><au>Dawei Huang</au><au>Edler, B.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Perceptual audio coding using adaptive pre- and post-filters and lossless compression</atitle><jtitle>IEEE transactions on speech and audio processing</jtitle><stitle>T-SAP</stitle><date>2002-09-01</date><risdate>2002</risdate><volume>10</volume><issue>6</issue><spage>379</spage><epage>390</epage><pages>379-390</pages><issn>1063-6676</issn><issn>2329-9290</issn><eissn>1558-2353</eissn><eissn>2329-9304</eissn><coden>IESPEJ</coden><abstract>This paper proposes a versatile perceptual audio coding 
method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts traditional audio coding where both are integrated within the same subband decomposition. The separation allows for the independent optimization of the irrelevance and redundancy reduction units. For both reductions, we rely on adaptive filtering and predictive coding as much as possible to minimize the delay. A psycho-acoustically controlled adaptive linear filter is used for the irrelevance reduction, and the redundancy reduction is carried out by a predictive lossless coding scheme, which is termed weighted cascaded least mean squared (WCLMS) method. Experiments are carried out on a database of moderate size which contains mono-signals of different sampling rates and varying nature (music, speech, or mixed). They show that the proposed WCLMS lossless coder outperforms other competing lossless coders in terms of compression ratios and delay, as applied to the pre-filtered signal. Moreover, a subjective listening test of the combined pre-filter/lossless coder and a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TSA.2002.803444</doi><tpages>12</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1063-6676
ispartof IEEE transactions on speech and audio processing, 2002-09, Vol.10 (6), p.379-390
issn 1063-6676
2329-9290
1558-2353
2329-9304
language eng
recordid cdi_pascalfrancis_primary_13973497
source IEEE Electronic Library (IEL) Journals
subjects Adaptive filters
Applied sciences
Artificial intelligence
Audio coding
Coders
Coding
Compression ratio
Computer science; control theory; systems
Data compression
Decoding
Delay
Encoding
Exact sciences and technology
Information, signal and communications theory
Lossless
Multiple signal classification
Predictive coding
Psychology
Reduction
Redundancy
Sampling methods
Signal processing
Speech
Speech and sound recognition and synthesis. Linguistics
Speech processing
Telecommunications and information theory
title Perceptual audio coding using adaptive pre- and post-filters and lossless compression
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T21%3A30%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Perceptual%20audio%20coding%20using%20adaptive%20pre-%20and%20post-filters%20and%20lossless%20compression&rft.jtitle=IEEE%20transactions%20on%20speech%20and%20audio%20processing&rft.au=Schuller,%20G.D.T.&rft.date=2002-09-01&rft.volume=10&rft.issue=6&rft.spage=379&rft.epage=390&rft.pages=379-390&rft.issn=1063-6676&rft.eissn=1558-2353&rft.coden=IESPEJ&rft_id=info:doi/10.1109/TSA.2002.803444&rft_dat=%3Cproquest_pasca%3E28533269%3C/proquest_pasca%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=884371844&rft_id=info:pmid/&rft_ieee_id=1040262&rfr_iscdi=true