Utilizing chaos game representation for enhanced classification of SARS-CoV-2 variants with stacked sparse autoencoders

Since the beginning of the COVID-19 pandemic, the World Health Organization (WHO) has been tracking SARS-CoV-2 mutations. The SARS-CoV-2 consistently mutated throughout the pandemic, which resulted in many variants. A variant is a viral genome containing one or more genetic code mutations. Deep lear...

Bibliographic Details
Published in: Neural computing & applications 2024-11, Vol.36 (31), p.19823-19837
Main Authors: Coutinho, Maria G. F., Câmara, Gabriel B. M., Barbosa, Raquel de M., Fernandes, Marcelo A. C.
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c115z-4978c30c89695349033fb33a4c1431600f0c176b4e08dbaef06ab049a1521bfc3
container_end_page 19837
container_issue 31
container_start_page 19823
container_title Neural computing & applications
container_volume 36
creator Coutinho, Maria G. F.
Câmara, Gabriel B. M.
Barbosa, Raquel de M.
Fernandes, Marcelo A. C.
description Since the beginning of the COVID-19 pandemic, the World Health Organization (WHO) has been tracking SARS-CoV-2 mutations. The SARS-CoV-2 consistently mutated throughout the pandemic, which resulted in many variants. A variant is a viral genome containing one or more genetic code mutations. Deep learning techniques have been successfully used in many viral classification problems associated with viral infection diagnosis, metagenomics, phylogenetics, and analysis. This work proposed an effective viral genome classifier for SARS-CoV-2 variants using the deep neural network based on the stacked sparse autoencoder (SSAE). Aiming to achieve the best performance of the model, we explored the utilization of image representations of the complete genome sequences as the SSAE input. The dataset based on Chaos Game Representation (CGR) images was generated and applied to the experiments of classification of SARS-CoV-2 variants of concern (VOC). The SSAE technique provided great performance results, achieving classification accuracy of 99.9% for the validation set and 99.8% for the test set. Finally, the results indicated the relevance of using this deep learning technique in genome classification problems.
doi_str_mv 10.1007/s00521-024-10278-z
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3110546859</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3110546859</sourcerecordid><originalsourceid>FETCH-LOGICAL-c115z-4978c30c89695349033fb33a4c1431600f0c176b4e08dbaef06ab049a1521bfc3</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWD_-gKeA5-hkk93uHkvxCwTBWq8hmyZtapvUzNZif73RFbx5msO8zzvMQ8gFhysOMLxGgLLgDArJOBTDmu0PyIBLIZiAsj4kA2hkXldSHJMTxCUAyKouB2Q37fzK732YU7PQEelcry1NdpMs2tDpzsdAXUzUhoUOxs6oWWlE77zpd9HRyeh5wsbxlRX0QyevQ4d057sFxU6bt4zgRie0VG-7aIOJM5vwjBw5vUJ7_jtPyfT25mV8zx6f7h7Go0dmOC_3TDbD2ggwdVM1pZANCOFaIbQ0-TleATgwfFi10kI9a7V1UOkWZKN51tE6I07JZd-7SfF9a7FTy7hNIZ9UgnMovy00OVX0KZMiYrJObZJf6_SpOKhvwaoXrLJg9SNY7TMkeghzOMxt-qv-h_oC0KJ_oQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3110546859</pqid></control><display><type>article</type><title>Utilizing chaos game representation for enhanced classification of SARS-CoV-2 variants with stacked sparse autoencoders</title><source>Springer Nature</source><creator>Coutinho, Maria G. F. ; Câmara, Gabriel B. M. ; Barbosa, Raquel de M. ; Fernandes, Marcelo A. C.</creator><creatorcontrib>Coutinho, Maria G. F. ; Câmara, Gabriel B. M. ; Barbosa, Raquel de M. ; Fernandes, Marcelo A. C.</creatorcontrib><description>Since the beginning of the COVID-19 pandemic, the World Health Organization (WHO) has been tracking SARS-CoV-2 mutations. The SARS-CoV-2 consistently mutated throughout the pandemic, which resulted in many variants. A variant is a viral genome containing one or more genetic code mutations. Deep learning techniques have been successfully used in many viral classification problems associated with viral infection diagnosis, metagenomics, phylogenetics, and analysis. 
This work proposed an effective viral genome classifier for SARS-CoV-2 variants using the deep neural network based on the stacked sparse autoencoder (SSAE). Aiming to achieve the best performance of the model, we explored the utilization of image representations of the complete genome sequences as the SSAE input. The dataset based on Chaos Game Representation (CGR) images was generated and applied to the experiments of classification of SARS-CoV-2 variants of concern (VOC). The SSAE technique provided great performance results, achieving classification accuracy of 99.9% for the validation set and 99.8% for the test set. Finally, the results indicated the relevance of using this deep learning technique in genome classification problems.</description><identifier>ISSN: 0941-0643</identifier><identifier>EISSN: 1433-3058</identifier><identifier>DOI: 10.1007/s00521-024-10278-z</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Artificial Intelligence ; Artificial neural networks ; Classification ; Computational Biology/Bioinformatics ; Computational Science and Engineering ; Computer Science ; Data Mining and Knowledge Discovery ; Deep learning ; Gene sequencing ; Genetic code ; Genomes ; Image Processing and Computer Vision ; Machine learning ; Mutation ; Original Article ; Pandemics ; Probability and Statistics in Computer Science ; Representations ; Severe acute respiratory syndrome coronavirus 2 ; Viral diseases</subject><ispartof>Neural computing &amp; applications, 2024-11, Vol.36 (31), p.19823-19837</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c115z-4978c30c89695349033fb33a4c1431600f0c176b4e08dbaef06ab049a1521bfc3</cites><orcidid>0000-0001-7536-2506</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27922,27923</link.rule.ids></links><search><creatorcontrib>Coutinho, Maria G. F.</creatorcontrib><creatorcontrib>Câmara, Gabriel B. M.</creatorcontrib><creatorcontrib>Barbosa, Raquel de M.</creatorcontrib><creatorcontrib>Fernandes, Marcelo A. C.</creatorcontrib><title>Utilizing chaos game representation for enhanced classification of SARS-CoV-2 variants with stacked sparse autoencoders</title><title>Neural computing &amp; applications</title><addtitle>Neural Comput &amp; Applic</addtitle><description>Since the beginning of the COVID-19 pandemic, the World Health Organization (WHO) has been tracking SARS-CoV-2 mutations. The SARS-CoV-2 consistently mutated throughout the pandemic, which resulted in many variants. A variant is a viral genome containing one or more genetic code mutations. Deep learning techniques have been successfully used in many viral classification problems associated with viral infection diagnosis, metagenomics, phylogenetics, and analysis. This work proposed an effective viral genome classifier for SARS-CoV-2 variants using the deep neural network based on the stacked sparse autoencoder (SSAE). 
Aiming to achieve the best performance of the model, we explored the utilization of image representations of the complete genome sequences as the SSAE input. The dataset based on Chaos Game Representation (CGR) images was generated and applied to the experiments of classification of SARS-CoV-2 variants of concern (VOC). The SSAE technique provided great performance results, achieving classification accuracy of 99.9% for the validation set and 99.8% for the test set. Finally, the results indicated the relevance of using this deep learning technique in genome classification problems.</description><subject>Artificial Intelligence</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Computational Biology/Bioinformatics</subject><subject>Computational Science and Engineering</subject><subject>Computer Science</subject><subject>Data Mining and Knowledge Discovery</subject><subject>Deep learning</subject><subject>Gene sequencing</subject><subject>Genetic code</subject><subject>Genomes</subject><subject>Image Processing and Computer Vision</subject><subject>Machine learning</subject><subject>Mutation</subject><subject>Original Article</subject><subject>Pandemics</subject><subject>Probability and Statistics in Computer Science</subject><subject>Representations</subject><subject>Severe acute respiratory syndrome coronavirus 2</subject><subject>Viral 
diseases</subject><issn>0941-0643</issn><issn>1433-3058</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWD_-gKeA5-hkk93uHkvxCwTBWq8hmyZtapvUzNZif73RFbx5msO8zzvMQ8gFhysOMLxGgLLgDArJOBTDmu0PyIBLIZiAsj4kA2hkXldSHJMTxCUAyKouB2Q37fzK732YU7PQEelcry1NdpMs2tDpzsdAXUzUhoUOxs6oWWlE77zpd9HRyeh5wsbxlRX0QyevQ4d057sFxU6bt4zgRie0VG-7aIOJM5vwjBw5vUJ7_jtPyfT25mV8zx6f7h7Go0dmOC_3TDbD2ggwdVM1pZANCOFaIbQ0-TleATgwfFi10kI9a7V1UOkWZKN51tE6I07JZd-7SfF9a7FTy7hNIZ9UgnMovy00OVX0KZMiYrJObZJf6_SpOKhvwaoXrLJg9SNY7TMkeghzOMxt-qv-h_oC0KJ_oQ</recordid><startdate>20241101</startdate><enddate>20241101</enddate><creator>Coutinho, Maria G. F.</creator><creator>Câmara, Gabriel B. M.</creator><creator>Barbosa, Raquel de M.</creator><creator>Fernandes, Marcelo A. C.</creator><general>Springer London</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0001-7536-2506</orcidid></search><sort><creationdate>20241101</creationdate><title>Utilizing chaos game representation for enhanced classification of SARS-CoV-2 variants with stacked sparse autoencoders</title><author>Coutinho, Maria G. F. ; Câmara, Gabriel B. M. ; Barbosa, Raquel de M. ; Fernandes, Marcelo A. 
C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c115z-4978c30c89695349033fb33a4c1431600f0c176b4e08dbaef06ab049a1521bfc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Artificial Intelligence</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Computational Biology/Bioinformatics</topic><topic>Computational Science and Engineering</topic><topic>Computer Science</topic><topic>Data Mining and Knowledge Discovery</topic><topic>Deep learning</topic><topic>Gene sequencing</topic><topic>Genetic code</topic><topic>Genomes</topic><topic>Image Processing and Computer Vision</topic><topic>Machine learning</topic><topic>Mutation</topic><topic>Original Article</topic><topic>Pandemics</topic><topic>Probability and Statistics in Computer Science</topic><topic>Representations</topic><topic>Severe acute respiratory syndrome coronavirus 2</topic><topic>Viral diseases</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Coutinho, Maria G. F.</creatorcontrib><creatorcontrib>Câmara, Gabriel B. M.</creatorcontrib><creatorcontrib>Barbosa, Raquel de M.</creatorcontrib><creatorcontrib>Fernandes, Marcelo A. C.</creatorcontrib><collection>CrossRef</collection><jtitle>Neural computing &amp; applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Coutinho, Maria G. F.</au><au>Câmara, Gabriel B. M.</au><au>Barbosa, Raquel de M.</au><au>Fernandes, Marcelo A. 
C.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Utilizing chaos game representation for enhanced classification of SARS-CoV-2 variants with stacked sparse autoencoders</atitle><jtitle>Neural computing &amp; applications</jtitle><stitle>Neural Comput &amp; Applic</stitle><date>2024-11-01</date><risdate>2024</risdate><volume>36</volume><issue>31</issue><spage>19823</spage><epage>19837</epage><pages>19823-19837</pages><issn>0941-0643</issn><eissn>1433-3058</eissn><abstract>Since the beginning of the COVID-19 pandemic, the World Health Organization (WHO) has been tracking SARS-CoV-2 mutations. The SARS-CoV-2 consistently mutated throughout the pandemic, which resulted in many variants. A variant is a viral genome containing one or more genetic code mutations. Deep learning techniques have been successfully used in many viral classification problems associated with viral infection diagnosis, metagenomics, phylogenetics, and analysis. This work proposed an effective viral genome classifier for SARS-CoV-2 variants using the deep neural network based on the stacked sparse autoencoder (SSAE). Aiming to achieve the best performance of the model, we explored the utilization of image representations of the complete genome sequences as the SSAE input. The dataset based on Chaos Game Representation (CGR) images was generated and applied to the experiments of classification of SARS-CoV-2 variants of concern (VOC). The SSAE technique provided great performance results, achieving classification accuracy of 99.9% for the validation set and 99.8% for the test set. Finally, the results indicated the relevance of using this deep learning technique in genome classification problems.</abstract><cop>London</cop><pub>Springer London</pub><doi>10.1007/s00521-024-10278-z</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0001-7536-2506</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0941-0643
ispartof Neural computing & applications, 2024-11, Vol.36 (31), p.19823-19837
issn 0941-0643
1433-3058
language eng
recordid cdi_proquest_journals_3110546859
source Springer Nature
subjects Artificial Intelligence
Artificial neural networks
Classification
Computational Biology/Bioinformatics
Computational Science and Engineering
Computer Science
Data Mining and Knowledge Discovery
Deep learning
Gene sequencing
Genetic code
Genomes
Image Processing and Computer Vision
Machine learning
Mutation
Original Article
Pandemics
Probability and Statistics in Computer Science
Representations
Severe acute respiratory syndrome coronavirus 2
Viral diseases
title Utilizing chaos game representation for enhanced classification of SARS-CoV-2 variants with stacked sparse autoencoders
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T01%3A25%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Utilizing%20chaos%20game%20representation%20for%20enhanced%20classification%20of%20SARS-CoV-2%20variants%20with%20stacked%20sparse%20autoencoders&rft.jtitle=Neural%20computing%20&%20applications&rft.au=Coutinho,%20Maria%20G.%20F.&rft.date=2024-11-01&rft.volume=36&rft.issue=31&rft.spage=19823&rft.epage=19837&rft.pages=19823-19837&rft.issn=0941-0643&rft.eissn=1433-3058&rft_id=info:doi/10.1007/s00521-024-10278-z&rft_dat=%3Cproquest_cross%3E3110546859%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c115z-4978c30c89695349033fb33a4c1431600f0c176b4e08dbaef06ab049a1521bfc3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3110546859&rft_id=info:pmid/&rfr_iscdi=true
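
The description in this record says the classifier takes Chaos Game Representation (CGR) images of complete genome sequences as input. As a rough illustration of what such an image is, the following is a minimal sketch of the general CGR technique, not the authors' implementation. The corner assignment, the 64 x 64 resolution, and the function name `cgr_image` are assumptions made for this example; the record does not state the settings used in the article.

```python
# Minimal CGR sketch (illustrative only; not taken from the article).
# Assumed corner layout for the unit square; conventions differ between sources.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_image(sequence, size=64):
    """Return a size x size grid of visit counts for the CGR of `sequence`.

    `size` is a hypothetical resolution chosen for this example.
    """
    grid = [[0] * size for _ in range(size)]
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for base in sequence.upper():
        corner = CORNERS.get(base)
        if corner is None:
            continue  # skip ambiguous symbols such as N
        # Move halfway from the current point towards the base's corner.
        x = (x + corner[0]) / 2.0
        y = (y + corner[1]) / 2.0
        # Bin the point into a pixel; min() guards the upper edge.
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid
```

Each nucleotide moves the current point halfway towards its corner, so the resulting grid of counts encodes the subsequence composition of the genome as a fixed-size image, which could then be flattened and passed to a model such as an autoencoder.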