Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network for Remote Sensing Image Super-Resolution

Bibliographic Details
Published in: arXiv.org, 2023-07
Main Authors: Lu, Yuting; Min, Lingtong; Wang, Binglu; Le, Zheng; Wang, Xiaoxu; Zhao, Yongqiang; Long, Teng
Format: Article
Language: English
Subjects: Cognition; Image enhancement; Image quality; Image resolution; Pixels; Remote sensing; Satellite imagery
Online Access: Get full text
Description: Remote sensing image super-resolution (RSISR) plays a vital role in enhancing spatial details and improving the quality of satellite imagery. Recently, Transformer-based models have shown competitive performance in RSISR. To mitigate the quadratic computational complexity of global self-attention, various methods constrain attention to a local window, improving its efficiency. Consequently, the receptive field of a single attention layer is inadequate, leading to insufficient context modeling. Furthermore, while most Transformer-based approaches reuse shallow features through skip connections, relying solely on these connections treats shallow and deep features equally, impeding the model's ability to characterize them. To address these issues, we propose a novel Transformer architecture called Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network (SPIFFNet) for RSISR. Our proposed model effectively enhances global cognition and understanding of the entire image, facilitating efficient integration of features across stages. The model incorporates cross-spatial pixel integration attention (CSPIA) to introduce contextual information into a local window, while cross-stage feature fusion attention (CSFFA) adaptively fuses features from the previous stage to improve feature expression in line with the requirements of the current stage. We conducted comprehensive experiments on multiple benchmark datasets, demonstrating the superior performance of our proposed SPIFFNet in terms of both quantitative metrics and visual quality when compared with state-of-the-art methods.
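This record does not include the paper's implementation, but the two mechanisms the abstract names can be illustrated in a few lines. The PyTorch sketch below is illustrative only: the function and module names, head counts, window size, and shapes are all assumptions, and the attention blocks stand in for (rather than reproduce) the paper's actual CSPIA and CSFFA designs. It shows (1) self-attention restricted to local windows, which sidesteps the quadratic cost of global attention, and (2) a cross-stage fusion step in which current-stage queries attend to previous-stage keys and values, so earlier features are re-weighted adaptively instead of being merged by a plain skip connection.

import torch
import torch.nn as nn


def window_partition(x, ws):
    # Split a (B, H, W, C) feature map into (B*num_windows, ws*ws, C) token groups.
    B, H, W, C = x.shape
    x = x.reshape(B, H // ws, ws, W // ws, ws, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)


def window_reverse(tokens, ws, B, H, W):
    # Inverse of window_partition: reassemble window tokens into a (B, H, W, C) map.
    C = tokens.shape[-1]
    x = tokens.reshape(B, H // ws, W // ws, ws, ws, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)


class WindowSelfAttention(nn.Module):
    # Self-attention restricted to non-overlapping ws x ws windows, so cost
    # grows linearly with the number of windows instead of quadratically
    # with the total number of pixels.
    def __init__(self, dim, ws, heads=4):
        super().__init__()
        self.ws = ws
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                           # x: (B, H, W, C)
        B, H, W, C = x.shape
        tokens = window_partition(x, self.ws)       # (B*nW, ws*ws, C)
        out, _ = self.attn(tokens, tokens, tokens)  # attention within each window
        return window_reverse(out, self.ws, B, H, W)


class CrossStageFusion(nn.Module):
    # Stand-in for a CSFFA-style block: queries come from the current stage,
    # keys/values from the previous stage, so earlier features are re-weighted
    # to suit the current stage rather than added by a plain skip connection.
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cur, prev):                   # both: (B, N, C) token sequences
        fused, _ = self.attn(cur, prev, prev)       # cross-attention: Q=cur, K=V=prev
        return self.norm(cur + fused)               # residual keeps current-stage info


if __name__ == "__main__":
    B, H, W, C, ws = 1, 16, 16, 32, 4
    x = torch.randn(B, H, W, C)
    y = WindowSelfAttention(C, ws)(x)               # (1, 16, 16, 32)
    cur, prev = y.reshape(B, H * W, C), torch.randn(B, H * W, C)
    print(CrossStageFusion(C)(cur, prev).shape)     # torch.Size([1, 256, 32])

A full SPIFFNet block would additionally inject cross-window contextual information (the CSPIA idea) on top of the windowed attention; the sketch only marks where each mechanism sits in the pipeline.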
Identifier: EISSN 2331-8422
Record ID: cdi_proquest_journals_2834346630
Source: Publicly Available Content Database
URL: http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T14%3A23%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Cross-Spatial%20Pixel%20Integration%20and%20Cross-Stage%20Feature%20Fusion%20Based%20Transformer%20Network%20for%20Remote%20Sensing%20Image%20Super-Resolution&rft.jtitle=arXiv.org&rft.au=Lu,%20Yuting&rft.date=2023-07-06&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2834346630%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_28343466303%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2834346630&rft_id=info:pmid/&rfr_iscdi=true