
Vision Transformer-Based Ensemble Learning for Hyperspectral Image Classification

Bibliographic Details
Published in: Remote sensing (Basel, Switzerland), 2023-11, Vol.15 (21), p.5208
Publisher: MDPI AG, Basel
Main Authors: Liu, Jun; Guo, Haoran; He, Yile; Li, Huali
Format: Article
Language:English
Description: Hyperspectral image (HSI) classification, due to its characteristic combination of images and spectra, has important applications in various fields through pixel-level image classification. The fusion of spatial–spectral features is a topic of great interest in the context of hyperspectral image classification, which typically requires selecting a larger spatial neighborhood window, potentially leading to overlaps between training and testing samples. Vision Transformers (ViTs), with their powerful global modeling abilities, have had a significant impact on the field of computer vision through their various variants. In this study, an ensemble learning framework for HSI classification is proposed that integrates multiple variants of ViTs to achieve high-precision pixel-level classification. Firstly, a spatial shuffle operation was introduced to preprocess the training samples for HSI classification. By randomly shuffling pixels within smaller spatial neighborhood windows, a greater range of potential spatial distributions of pixels can be described. Then, the training samples were transformed from a 3D cube into a 2D image, and a learning framework was built by integrating seven ViT variants. Finally, a two-level ensemble strategy was employed to achieve pixel-level classification based on the results of the multiple ViT variants. The experimental results demonstrate that the proposed ensemble learning framework achieves stable and consistently high classification accuracy on multiple publicly available HSI datasets. The proposed method also shows notable classification performance with varying numbers of training samples. Moreover, the spatial shuffle operation is shown to play a crucial role in improving classification accuracy. By introducing superior individual classifiers, the proposed ensemble framework is expected to achieve even better classification performance.
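The spatial shuffle preprocessing and the 3D-cube-to-2D-image transform described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the window size, array layout, and function names are assumptions.

```python
import numpy as np

def spatial_shuffle(patch, rng=None):
    """Randomly permute the spatial positions of pixels in a small HSI patch.

    patch: (H, W, B) array -- an H x W spatial neighborhood window with
    B spectral bands. Each pixel's spectrum is kept intact as a whole;
    only the spatial arrangement of the pixels changes.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, b = patch.shape
    flat = patch.reshape(h * w, b)             # one row per pixel spectrum
    return flat[rng.permutation(h * w)].reshape(h, w, b)

def cube_to_image(patch):
    """Flatten a (H, W, B) cube into a 2D (H*W, B) array -- a stand-in for
    the 3D-cube-to-2D-image transform mentioned in the abstract."""
    h, w, b = patch.shape
    return patch.reshape(h * w, b)
```

Because the shuffle only rearranges whole spectra within a small window, it enlarges the set of spatial configurations seen during training without altering any pixel's spectral signature.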
DOI: 10.3390/rs15215208
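The two-level ensemble strategy could, for example, fuse the per-pixel predictions of the ViT variants by majority voting at each level. The sketch below assumes that fusion rule; the paper's actual strategy may differ.

```python
import numpy as np

def majority_vote(preds):
    """Per-pixel majority vote over a stack of label maps.

    preds: (n_models, n_pixels) integer class labels.
    Ties resolve to the smallest label (np.unique returns sorted values).
    """
    fused = np.empty(preds.shape[1], dtype=preds.dtype)
    for i in range(preds.shape[1]):
        vals, counts = np.unique(preds[:, i], return_counts=True)
        fused[i] = vals[np.argmax(counts)]
    return fused

def two_level_ensemble(groups):
    """Level 1: fuse predictions within each group of ViT variants.
    Level 2: fuse the per-group results into the final label map.

    groups: list of (n_models_g, n_pixels) label arrays, one per group.
    """
    level1 = np.stack([majority_vote(g) for g in groups])
    return majority_vote(level1)
```

A hierarchical vote of this kind lets each group of related classifiers reach an internal consensus before the groups are combined, which can dampen the influence of any single weak variant.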
ISSN: 2072-4292
EISSN: 2072-4292
Source: Publicly Available Content (ProQuest)
Subjects: Accuracy
Classification
Computational linguistics
Computer vision
Deep learning
Discriminant analysis
Ensemble learning
hyperspectral image classification
Hyperspectral imaging
Image classification
Language processing
Learning
Machine learning
Machine vision
Medical screening
Natural language interfaces
Neural networks
Pixels
Remote sensing
Spatial distribution
spatial shuffle
Support vector machines
Training
vision transformer
Wavelet transforms