Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution
Audio super resolution aims to predict the missing high resolution components of the low resolution audio signals. While audio in nature is a continuous signal, current approaches treat it as discrete data (i.e., input is defined on discrete time domain), and consider the super resolution over a fix...
Published in: | arXiv.org 2022-03 |
---|---|
Main Authors: | Kim, Jaechang ; Lee, Yunjoo ; Hong, Seunghoon ; Jungseul Ok |
Format: | Article |
Language: | English |
Subjects: | Audio signals ; Continuity (mathematics) ; Neural networks ; Representations ; Supervised learning |
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Kim, Jaechang ; Lee, Yunjoo ; Hong, Seunghoon ; Jungseul Ok |
description | Audio super resolution aims to predict the missing high resolution components of the low resolution audio signals. While audio in nature is a continuous signal, current approaches treat it as discrete data (i.e., input is defined on discrete time domain), and consider the super resolution over a fixed scale factor (i.e., it is required to train a new neural network to change output resolution). To obtain a continuous representation of audio and enable super resolution for arbitrary scale factor, we propose a method of implicit neural representation, coined Local Implicit representation for Super resolution of Arbitrary scale (LISA). Our method locally parameterizes a chunk of audio as a function of continuous time, and represents each chunk with the local latent codes of neighboring chunks so that the function can extrapolate the signal at any time coordinate, i.e., infinite resolution. To learn a continuous representation for audio, we design a self-supervised learning strategy to practice super resolution tasks up to the original resolution by stochastic selection. Our numerical evaluation shows that LISA outperforms the previous fixed-scale methods with a fraction of parameters, but also is capable of arbitrary scale super resolution even beyond the resolution of training data. |
format | article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2591830271</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2591830271</sourcerecordid><originalsourceid>FETCH-proquest_journals_25918302713</originalsourceid><addsrcrecordid>eNqNi7EKwjAUAIMgWLT_8MC50CbW1rEUxcFBrHuJ-iopJa--JIN_bwU_wOmGu5uJSCqVJeVGyoWInevTNJXbQua5isT5hJqtsU-oyXpjAwUHFxwZHVqvvSEL1EEVHoagI4aKb8az5jc0dz0gNGFEng5HQ_jWKzHv9OAw_nEp1of9tT4mI9MroPNtT4HtpFqZ77JSpbLI1H_VB6FxP2Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2591830271</pqid></control><display><type>article</type><title>Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution</title><source>Publicly Available Content Database</source><creator>Kim, Jaechang ; Lee, Yunjoo ; Hong, Seunghoon ; Jungseul Ok</creator><creatorcontrib>Kim, Jaechang ; Lee, Yunjoo ; Hong, Seunghoon ; Jungseul Ok</creatorcontrib><description>Audio super resolution aims to predict the missing high resolution components of the low resolution audio signals. While audio in nature is a continuous signal, current approaches treat it as discrete data (i.e., input is defined on discrete time domain), and consider the super resolution over a fixed scale factor (i.e., it is required to train a new neural network to change output resolution). To obtain a continuous representation of audio and enable super resolution for arbitrary scale factor, we propose a method of implicit neural representation, coined Local Implicit representation for Super resolution of Arbitrary scale (LISA). Our method locally parameterizes a chunk of audio as a function of continuous time, and represents each chunk with the local latent codes of neighboring chunks so that the function can extrapolate the signal at any time coordinate, i.e., infinite resolution. 
To learn a continuous representation for audio, we design a self-supervised learning strategy to practice super resolution tasks up to the original resolution by stochastic selection. Our numerical evaluation shows that LISA outperforms the previous fixed-scale methods with a fraction of parameters, but also is capable of arbitrary scale super resolution even beyond the resolution of training data.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Audio signals ; Continuity (mathematics) ; Neural networks ; Representations ; Supervised learning</subject><ispartof>arXiv.org, 2022-03</ispartof><rights>2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2591830271?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>776,780,25732,36991,44569</link.rule.ids></links><search><creatorcontrib>Kim, Jaechang</creatorcontrib><creatorcontrib>Lee, Yunjoo</creatorcontrib><creatorcontrib>Hong, Seunghoon</creatorcontrib><creatorcontrib>Jungseul Ok</creatorcontrib><title>Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution</title><title>arXiv.org</title><description>Audio super resolution aims to predict the missing high resolution components of the low resolution audio signals. 
While audio in nature is a continuous signal, current approaches treat it as discrete data (i.e., input is defined on discrete time domain), and consider the super resolution over a fixed scale factor (i.e., it is required to train a new neural network to change output resolution). To obtain a continuous representation of audio and enable super resolution for arbitrary scale factor, we propose a method of implicit neural representation, coined Local Implicit representation for Super resolution of Arbitrary scale (LISA). Our method locally parameterizes a chunk of audio as a function of continuous time, and represents each chunk with the local latent codes of neighboring chunks so that the function can extrapolate the signal at any time coordinate, i.e., infinite resolution. To learn a continuous representation for audio, we design a self-supervised learning strategy to practice super resolution tasks up to the original resolution by stochastic selection. Our numerical evaluation shows that LISA outperforms the previous fixed-scale methods with a fraction of parameters, but also is capable of arbitrary scale super resolution even beyond the resolution of training data.</description><subject>Audio signals</subject><subject>Continuity (mathematics)</subject><subject>Neural networks</subject><subject>Representations</subject><subject>Supervised learning</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNi7EKwjAUAIMgWLT_8MC50CbW1rEUxcFBrHuJ-iopJa--JIN_bwU_wOmGu5uJSCqVJeVGyoWInevTNJXbQua5isT5hJqtsU-oyXpjAwUHFxwZHVqvvSEL1EEVHoagI4aKb8az5jc0dz0gNGFEng5HQ_jWKzHv9OAw_nEp1of9tT4mI9MroPNtT4HtpFqZ77JSpbLI1H_VB6FxP2Q</recordid><startdate>20220330</startdate><enddate>20220330</enddate><creator>Kim, Jaechang</creator><creator>Lee, Yunjoo</creator><creator>Hong, Seunghoon</creator><creator>Jungseul Ok</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220330</creationdate><title>Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution</title><author>Kim, Jaechang ; Lee, Yunjoo ; Hong, Seunghoon ; Jungseul Ok</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_25918302713</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Audio signals</topic><topic>Continuity (mathematics)</topic><topic>Neural networks</topic><topic>Representations</topic><topic>Supervised learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Kim, Jaechang</creatorcontrib><creatorcontrib>Lee, Yunjoo</creatorcontrib><creatorcontrib>Hong, Seunghoon</creatorcontrib><creatorcontrib>Jungseul Ok</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT 
USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kim, Jaechang</au><au>Lee, Yunjoo</au><au>Hong, Seunghoon</au><au>Jungseul Ok</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution</atitle><jtitle>arXiv.org</jtitle><date>2022-03-30</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Audio super resolution aims to predict the missing high resolution components of the low resolution audio signals. While audio in nature is a continuous signal, current approaches treat it as discrete data (i.e., input is defined on discrete time domain), and consider the super resolution over a fixed scale factor (i.e., it is required to train a new neural network to change output resolution). To obtain a continuous representation of audio and enable super resolution for arbitrary scale factor, we propose a method of implicit neural representation, coined Local Implicit representation for Super resolution of Arbitrary scale (LISA). Our method locally parameterizes a chunk of audio as a function of continuous time, and represents each chunk with the local latent codes of neighboring chunks so that the function can extrapolate the signal at any time coordinate, i.e., infinite resolution. To learn a continuous representation for audio, we design a self-supervised learning strategy to practice super resolution tasks up to the original resolution by stochastic selection. 
Our numerical evaluation shows that LISA outperforms the previous fixed-scale methods with a fraction of parameters, but also is capable of arbitrary scale super resolution even beyond the resolution of training data.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2022-03 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2591830271 |
source | Publicly Available Content Database |
subjects | Audio signals ; Continuity (mathematics) ; Neural networks ; Representations ; Supervised learning |
title | Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T15%3A22%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Learning%20Continuous%20Representation%20of%20Audio%20for%20Arbitrary%20Scale%20Super%20Resolution&rft.jtitle=arXiv.org&rft.au=Kim,%20Jaechang&rft.date=2022-03-30&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2591830271%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_25918302713%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2591830271&rft_id=info:pmid/&rfr_iscdi=true |
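
The description field in this record says that the method represents a chunk of audio as a function of continuous time, so the signal can be evaluated at any time coordinate and at any scale factor. The Python sketch below only illustrates that idea and is not the LISA model: the learned local decoder is swapped for plain linear interpolation (`numpy.interp`), and the helper names `make_continuous` and `resample`, the 4 kHz input rate, and the 440 Hz test tone are assumptions made for illustration.

```python
import numpy as np


def make_continuous(samples, sample_rate):
    """Wrap discrete samples as a function f(t) of continuous time in seconds.

    Hypothetical helper: linear interpolation between neighbouring samples
    stands in for a learned local implicit decoder.
    """
    samples = np.asarray(samples, dtype=float)
    times = np.arange(len(samples)) / sample_rate

    def f(t):
        # Evaluate the signal at arbitrary time coordinates t.
        # Queries past the last sample are clamped to the last value.
        return np.interp(t, times, samples)

    return f


def resample(f, duration, target_rate):
    """Sample the continuous function f on a grid of any target rate."""
    n = int(round(duration * target_rate))
    t = np.arange(n) / target_rate
    return f(t)


# Low-resolution input: 0.1 s of a 440 Hz tone at 4 kHz (assumed values).
low_rate = 4000
low = np.sin(2 * np.pi * 440 * np.arange(400) / low_rate)

f = make_continuous(low, low_rate)
up_2x = resample(f, 0.1, 8000)      # integer scale factor 2
up_2_5x = resample(f, 0.1, 10000)   # non-integer scale factor 2.5
```

Because the output grid is chosen after the continuous function is built, the same function serves both the 2x and the 2.5x request. This is the property the abstract contrasts with fixed-scale methods, which need a new network for each output resolution.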