Perceptual audio coding using adaptive pre- and post-filters and lossless compression
This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and r...
Published in: | IEEE transactions on speech and audio processing 2002-09, Vol.10 (6), p.379-390 |
---|---|
Main Authors: | Schuller, G.D.T.; Bin Yu; Dawei Huang; Edler, B. |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3 |
---|---|
cites | cdi_FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3 |
container_end_page | 390 |
container_issue | 6 |
container_start_page | 379 |
container_title | IEEE transactions on speech and audio processing |
container_volume | 10 |
creator | Schuller, G.D.T. ; Bin Yu ; Dawei Huang ; Edler, B. |
description | This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts with traditional audio coding, where both are integrated within the same subband decomposition. The separation allows for the independent optimization of the irrelevance and redundancy reduction units. For both reductions, we rely on adaptive filtering and predictive coding as much as possible to minimize the delay. A psycho-acoustically controlled adaptive linear filter is used for the irrelevance reduction, and the redundancy reduction is carried out by a predictive lossless coding scheme, termed the weighted cascaded least mean squared (WCLMS) method. Experiments are carried out on a database of moderate size which contains mono signals of different sampling rates and varying nature (music, speech, or mixed). They show that the proposed WCLMS lossless coder outperforms other competing lossless coders in terms of compression ratios and delay, as applied to the pre-filtered signal. Moreover, a subjective listening test of the combined pre-filter/lossless coder and a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay. |
doi_str_mv | 10.1109/TSA.2002.803444 |
format | article |
fulltext | fulltext |
identifier | ISSN: 1063-6676 |
ispartof | IEEE transactions on speech and audio processing, 2002-09, Vol.10 (6), p.379-390 |
issn | 1063-6676 ; 2329-9290 ; 1558-2353 ; 2329-9304 |
language | eng |
recordid | cdi_pascalfrancis_primary_13973497 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Adaptive filters ; Applied sciences ; Artificial intelligence ; Audio coding ; Coders ; Coding ; Compression ratio ; Computer science; control theory; systems ; Data compression ; Decoding ; Delay ; Encoding ; Exact sciences and technology ; Information, signal and communications theory ; Lossless ; Multiple signal classification ; Predictive coding ; Psychology ; Reduction ; Redundancy ; Sampling methods ; Signal processing ; Speech ; Speech and sound recognition and synthesis. Linguistics ; Speech processing ; Telecommunications and information theory |
title | Perceptual audio coding using adaptive pre- and post-filters and lossless compression |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T21%3A30%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Perceptual%20audio%20coding%20using%20adaptive%20pre-%20and%20post-filters%20and%20lossless%20compression&rft.jtitle=IEEE%20transactions%20on%20speech%20and%20audio%20processing&rft.au=Schuller,%20G.D.T.&rft.date=2002-09-01&rft.volume=10&rft.issue=6&rft.spage=379&rft.epage=390&rft.pages=379-390&rft.issn=1063-6676&rft.eissn=1558-2353&rft.coden=IESPEJ&rft_id=info:doi/10.1109/TSA.2002.803444&rft_dat=%3Cproquest_pasca%3E28533269%3C/proquest_pasca%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c446t-dadfb5fade2a5cdb2ff3bb0630dea14db242ee7fce0394e8d679da9ccb8c4bbb3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=884371844&rft_id=info:pmid/&rft_ieee_id=1040262&rfr_iscdi=true |
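
The abstract in this record describes redundancy reduction by a predictive lossless coding scheme built on adaptive filtering. As a rough illustration of that general idea only, the sketch below uses a single normalized LMS (NLMS) predictor, not the paper's WCLMS cascade. The class name `NlmsPredictor`, the functions `encode` and `decode`, and the parameters `order` and `mu` are hypothetical choices made for this example and do not come from the paper. The encoder transmits integer prediction residuals; the decoder runs the identical predictor, so it reconstructs the samples exactly without any side information about the filter weights.

```python
import math
from typing import List


class NlmsPredictor:
    """Single normalized-LMS predictor (illustrative stand-in, not WCLMS)."""

    def __init__(self, order: int = 8, mu: float = 0.5) -> None:
        self.w = [0.0] * order      # adaptive filter weights
        self.hist = [0.0] * order   # past samples, most recent first
        self.mu = mu                # step size (assumed value)

    def predict(self) -> int:
        # Round to an integer so residuals are integers and coding is lossless.
        return round(sum(w * h for w, h in zip(self.w, self.hist)))

    def update(self, sample: int, prediction: int) -> None:
        # NLMS update: step is normalized by the energy of the history.
        err = sample - prediction
        energy = sum(h * h for h in self.hist) + 1e-9
        step = self.mu * err / energy
        self.w = [w + step * h for w, h in zip(self.w, self.hist)]
        self.hist = [float(sample)] + self.hist[:-1]


def encode(samples: List[int], order: int = 8, mu: float = 0.5) -> List[int]:
    """Turn integer samples into integer prediction residuals."""
    predictor = NlmsPredictor(order, mu)
    residuals = []
    for x in samples:
        pred = predictor.predict()
        residuals.append(x - pred)
        predictor.update(x, pred)
    return residuals


def decode(residuals: List[int], order: int = 8, mu: float = 0.5) -> List[int]:
    """Rebuild the samples by mirroring the encoder's predictor state."""
    predictor = NlmsPredictor(order, mu)
    samples = []
    for r in residuals:
        pred = predictor.predict()
        x = r + pred
        samples.append(x)
        predictor.update(x, pred)
    return samples


if __name__ == "__main__":
    # Synthetic integer test tone (made-up data, not from the paper).
    signal = [round(1000 * math.sin(0.05 * n)) for n in range(500)]
    assert decode(encode(signal)) == signal
```

Because the predictor adapts only from already-decoded samples, it needs no look-ahead block, which is consistent with the low-delay motivation stated in the abstract. An entropy coder for the residuals is omitted here.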