Predicting subcellular location of protein with evolution information and sequence-based deep learning
Protein subcellular localization prediction plays an important role in biology research. Since traditional methods are laborious and time-consuming, many machine learning-based prediction methods have been proposed. However, most of the proposed methods ignore the evolution information of proteins....
Published in: | BMC bioinformatics 2021-10, Vol.22 (1), p.1-515, Article 515 |
---|---|
Main Authors: | Liao, Zhijun; Pan, Gaofeng; Sun, Chao; Tang, Jijun |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c574t-a7dcb6ca711e026caf7bff37853fb3dc88a2a80a69de57a56c24936cd28b58683 |
---|---|
cites | cdi_FETCH-LOGICAL-c574t-a7dcb6ca711e026caf7bff37853fb3dc88a2a80a69de57a56c24936cd28b58683 |
container_end_page | 515 |
container_issue | 1 |
container_start_page | 1 |
container_title | BMC bioinformatics |
container_volume | 22 |
creator | Liao, Zhijun; Pan, Gaofeng; Sun, Chao; Tang, Jijun |
description | Protein subcellular localization prediction plays an important role in biological research. Because traditional methods are laborious and time-consuming, many machine learning-based prediction methods have been proposed. However, most of them ignore the evolutionary information of proteins. To improve prediction accuracy, we present a deep learning-based method for predicting protein subcellular locations. The method uses not only the amino acid composition of protein sequences but also the evolution matrices of proteins. It combines a bidirectional long short-term memory network, which processes the entire protein sequence, with a convolutional neural network, which extracts features from protein sequences. The position-specific scoring matrix is used as a supplement to the protein sequences. The method was trained and tested on two benchmark datasets. The experimental results show that it yields accurate results on the two datasets, with an average precision of 0.7901, a ranking loss of 0.0758, and a coverage of 1.2848, and that it outperforms five currently available methods. These experiments indicate that the method is an acceptable alternative for predicting protein subcellular location. |
doi_str_mv | 10.1186/s12859-021-04404-0 |
format | article |
fullrecord | <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_6219b9b6c2e54e91ad147c4030d46847</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A681639345</galeid><doaj_id>oai_doaj_org_article_6219b9b6c2e54e91ad147c4030d46847</doaj_id><sourcerecordid>A681639345</sourcerecordid><originalsourceid>FETCH-LOGICAL-c574t-a7dcb6ca711e026caf7bff37853fb3dc88a2a80a69de57a56c24936cd28b58683</originalsourceid><addsrcrecordid>eNptkktv1DAQxyMEoqXwBThF4gKHFL_jXJCqisdKlUA8ztbEnmy9ytqLnfTx7fFuKmAR8sGjmd_8x_5rquolJeeUavU2U6Zl1xBGGyIEEQ15VJ1S0dKGUSIf_xWfVM9y3hBCW03k0-qEC6UVley0Gr4kdN5OPqzrPPcWx3EeIdVjtDD5GOo41LsUJ_ShvvXTdY03cZwPFR-GmLYLBcHVGX_OGCw2PWR0tUPc1SNCCkX7efVkgDHji4f7rPrx4f33y0_N1eePq8uLq8bKVkwNtM72ykJLKRJWgqHth4G3WvKh585qDQw0AdU5lC1IZZnouLKO6V5qpflZtVp0XYSN2SW_hXRvInhzSMS0NpAmb0c0itGu78o0hlJgR8EVu6wgnLjijmiL1rtFazf3W3QWw5RgPBI9rgR_bdbxxpTXdprRIvD6QSDFYk2ezNbnvcMQMM7ZMFnGaCYJL-irf9BNnFMoVhWq07rjXJE_1BrKB_b-l7l2L2oulKaKd1zIQp3_hyrH4dbbGHDwJX_U8OaooTAT3k1rmHM2q29fj1m2sDbFnBMOv_2gxOy30ixbacpWmsNWGsJ_AXaG0t4</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2598893360</pqid></control><display><type>article</type><title>Predicting subcellular location of protein with evolution information and sequence-based deep learning</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Liao, Zhijun ; Pan, Gaofeng ; Sun, Chao ; Tang, Jijun</creator><creatorcontrib>Liao, Zhijun ; Pan, Gaofeng ; Sun, Chao ; Tang, Jijun</creatorcontrib><description>Protein subcellular localization prediction plays an important role in biology research. Since traditional methods are laborious and time-consuming, many machine learning-based prediction methods have been proposed. However, most of the proposed methods ignore the evolution information of proteins. 
In order to improve the prediction accuracy, we present a deep learning-based method to predict protein subcellular locations. Our method utilizes not only amino acid compositions sequence but also evolution matrices of proteins. Our method uses a bidirectional long short-term memory network that processes the entire protein sequence and a convolutional neural network that extracts features from protein sequences. The position specific scoring matrix is used as a supplement to protein sequences. Our method was trained and tested on two benchmark datasets. The experiment results show that our method yields accurate results on the two datasets with an average precision of 0.7901, ranking loss of 0.0758 and coverage of 1.2848. The experiment results show that our method outperforms five methods currently available. According to those experiments, we can see that our method is an acceptable alternative to predict protein subcellular location.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/s12859-021-04404-0</identifier><identifier>PMID: 34686152</identifier><language>eng</language><publisher>London: BioMed Central Ltd</publisher><subject>Accuracy ; Algorithms ; Amino acid sequence ; Amino acids ; Artificial intelligence ; Artificial neural networks ; Classification ; Datasets ; Deep learning ; Evolution ; Evolution information ; Feature extraction ; Learning algorithms ; Localization ; Long short-term memory ; Machine learning ; Methods ; Multiple label classification ; Neural networks ; Predictions ; Protein research ; Protein sequence ; Proteins ; Subcellular prediction ; Support vector machines</subject><ispartof>BMC bioinformatics, 2021-10, Vol.22 (1), p.1-515, Article 515</ispartof><rights>COPYRIGHT 2021 BioMed Central Ltd.</rights><rights>2021. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>The Author(s) 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c574t-a7dcb6ca711e026caf7bff37853fb3dc88a2a80a69de57a56c24936cd28b58683</citedby><cites>FETCH-LOGICAL-c574t-a7dcb6ca711e026caf7bff37853fb3dc88a2a80a69de57a56c24936cd28b58683</cites><orcidid>0000-0002-6377-536X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8539821/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2598893360?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,25753,27924,27925,37012,37013,44590,53791,53793</link.rule.ids></links><search><creatorcontrib>Liao, Zhijun</creatorcontrib><creatorcontrib>Pan, Gaofeng</creatorcontrib><creatorcontrib>Sun, Chao</creatorcontrib><creatorcontrib>Tang, Jijun</creatorcontrib><title>Predicting subcellular location of protein with evolution information and sequence-based deep learning</title><title>BMC bioinformatics</title><description>Protein subcellular localization prediction plays an important role in biology research. Since traditional methods are laborious and time-consuming, many machine learning-based prediction methods have been proposed. However, most of the proposed methods ignore the evolution information of proteins. In order to improve the prediction accuracy, we present a deep learning-based method to predict protein subcellular locations. Our method utilizes not only amino acid compositions sequence but also evolution matrices of proteins. 
Our method uses a bidirectional long short-term memory network that processes the entire protein sequence and a convolutional neural network that extracts features from protein sequences. The position specific scoring matrix is used as a supplement to protein sequences. Our method was trained and tested on two benchmark datasets. The experiment results show that our method yields accurate results on the two datasets with an average precision of 0.7901, ranking loss of 0.0758 and coverage of 1.2848. The experiment results show that our method outperforms five methods currently available. According to those experiments, we can see that our method is an acceptable alternative to predict protein subcellular location.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Amino acid sequence</subject><subject>Amino acids</subject><subject>Artificial intelligence</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Evolution</subject><subject>Evolution information</subject><subject>Feature extraction</subject><subject>Learning algorithms</subject><subject>Localization</subject><subject>Long short-term memory</subject><subject>Machine learning</subject><subject>Methods</subject><subject>Multiple label classification</subject><subject>Neural networks</subject><subject>Predictions</subject><subject>Protein research</subject><subject>Protein sequence</subject><subject>Proteins</subject><subject>Subcellular prediction</subject><subject>Support vector 
machines</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNptkktv1DAQxyMEoqXwBThF4gKHFL_jXJCqisdKlUA8ztbEnmy9ytqLnfTx7fFuKmAR8sGjmd_8x_5rquolJeeUavU2U6Zl1xBGGyIEEQ15VJ1S0dKGUSIf_xWfVM9y3hBCW03k0-qEC6UVley0Gr4kdN5OPqzrPPcWx3EeIdVjtDD5GOo41LsUJ_ShvvXTdY03cZwPFR-GmLYLBcHVGX_OGCw2PWR0tUPc1SNCCkX7efVkgDHji4f7rPrx4f33y0_N1eePq8uLq8bKVkwNtM72ykJLKRJWgqHth4G3WvKh585qDQw0AdU5lC1IZZnouLKO6V5qpflZtVp0XYSN2SW_hXRvInhzSMS0NpAmb0c0itGu78o0hlJgR8EVu6wgnLjijmiL1rtFazf3W3QWw5RgPBI9rgR_bdbxxpTXdprRIvD6QSDFYk2ezNbnvcMQMM7ZMFnGaCYJL-irf9BNnFMoVhWq07rjXJE_1BrKB_b-l7l2L2oulKaKd1zIQp3_hyrH4dbbGHDwJX_U8OaooTAT3k1rmHM2q29fj1m2sDbFnBMOv_2gxOy30ixbacpWmsNWGsJ_AXaG0t4</recordid><startdate>20211022</startdate><enddate>20211022</enddate><creator>Liao, Zhijun</creator><creator>Pan, Gaofeng</creator><creator>Sun, Chao</creator><creator>Tang, Jijun</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><general>BMC</general><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-6377-536X</orcidid></search><sort><creationdate>20211022</creationdate><title>Predicting subcellular location of protein with evolution information and sequence-based deep learning</title><author>Liao, Zhijun ; Pan, Gaofeng ; Sun, Chao ; Tang, Jijun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c574t-a7dcb6ca711e026caf7bff37853fb3dc88a2a80a69de57a56c24936cd28b58683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Amino acid sequence</topic><topic>Amino acids</topic><topic>Artificial intelligence</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Evolution</topic><topic>Evolution information</topic><topic>Feature 
extraction</topic><topic>Learning algorithms</topic><topic>Localization</topic><topic>Long short-term memory</topic><topic>Machine learning</topic><topic>Methods</topic><topic>Multiple label classification</topic><topic>Neural networks</topic><topic>Predictions</topic><topic>Protein research</topic><topic>Protein sequence</topic><topic>Proteins</topic><topic>Subcellular prediction</topic><topic>Support vector machines</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Liao, Zhijun</creatorcontrib><creatorcontrib>Pan, Gaofeng</creatorcontrib><creatorcontrib>Sun, Chao</creatorcontrib><creatorcontrib>Tang, Jijun</creatorcontrib><collection>CrossRef</collection><collection>Science in Context</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Databases</collection><collection>Technology 
Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Biological Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Liao, 
Zhijun</au><au>Pan, Gaofeng</au><au>Sun, Chao</au><au>Tang, Jijun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predicting subcellular location of protein with evolution information and sequence-based deep learning</atitle><jtitle>BMC bioinformatics</jtitle><date>2021-10-22</date><risdate>2021</risdate><volume>22</volume><issue>1</issue><spage>1</spage><epage>515</epage><pages>1-515</pages><artnum>515</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Protein subcellular localization prediction plays an important role in biology research. Since traditional methods are laborious and time-consuming, many machine learning-based prediction methods have been proposed. However, most of the proposed methods ignore the evolution information of proteins. In order to improve the prediction accuracy, we present a deep learning-based method to predict protein subcellular locations. Our method utilizes not only amino acid compositions sequence but also evolution matrices of proteins. Our method uses a bidirectional long short-term memory network that processes the entire protein sequence and a convolutional neural network that extracts features from protein sequences. The position specific scoring matrix is used as a supplement to protein sequences. Our method was trained and tested on two benchmark datasets. The experiment results show that our method yields accurate results on the two datasets with an average precision of 0.7901, ranking loss of 0.0758 and coverage of 1.2848. The experiment results show that our method outperforms five methods currently available. According to those experiments, we can see that our method is an acceptable alternative to predict protein subcellular location.</abstract><cop>London</cop><pub>BioMed Central Ltd</pub><pmid>34686152</pmid><doi>10.1186/s12859-021-04404-0</doi><orcidid>https://orcid.org/0000-0002-6377-536X</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2021-10, Vol.22 (1), p.1-515, Article 515 |
issn | 1471-2105 1471-2105 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_6219b9b6c2e54e91ad147c4030d46847 |
source | Publicly Available Content Database; PubMed Central |
subjects | Accuracy; Algorithms; Amino acid sequence; Amino acids; Artificial intelligence; Artificial neural networks; Classification; Datasets; Deep learning; Evolution; Evolution information; Feature extraction; Learning algorithms; Localization; Long short-term memory; Machine learning; Methods; Multiple label classification; Neural networks; Predictions; Protein research; Protein sequence; Proteins; Subcellular prediction; Support vector machines |
title | Predicting subcellular location of protein with evolution information and sequence-based deep learning |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-23T00%3A30%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20subcellular%20location%20of%20protein%20with%20evolution%20information%20and%20sequence-based%20deep%20learning&rft.jtitle=BMC%20bioinformatics&rft.au=Liao,%20Zhijun&rft.date=2021-10-22&rft.volume=22&rft.issue=1&rft.spage=1&rft.epage=515&rft.pages=1-515&rft.artnum=515&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-021-04404-0&rft_dat=%3Cgale_doaj_%3EA681639345%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c574t-a7dcb6ca711e026caf7bff37853fb3dc88a2a80a69de57a56c24936cd28b58683%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2598893360&rft_id=info:pmid/34686152&rft_galeid=A681639345&rfr_iscdi=true |
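
The description field in this record reports three multi-label ranking metrics: average precision, ranking loss, and coverage. The record does not say how the article defines them, so the sketch below uses common textbook definitions and should be read as an illustration only, not as the authors' evaluation code. The function names, the toy labels, and the toy scores are all hypothetical. It assumes every protein has at least one true location and at least one false one, counts ties against the model, and reports coverage as the 1-based rank of the lowest-ranked true label (some papers subtract 1).

```python
def _rank(scores, j):
    """1-based rank of label j: how many labels score at least as high."""
    return sum(1 for s in scores if s >= scores[j])


def coverage(y_true, y_score):
    """Average depth in the ranking needed to reach every true label."""
    total = 0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        total += max(_rank(scores, j) for j in relevant)
    return total / len(y_true)


def ranking_loss(y_true, y_score):
    """Average fraction of (true, false) label pairs ordered wrongly."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        irrelevant = [j for j, t in enumerate(truth) if not t]
        # A pair is wrong when the false label scores at least as high.
        bad = sum(1 for r in relevant for i in irrelevant
                  if scores[i] >= scores[r])
        total += bad / (len(relevant) * len(irrelevant))
    return total / len(y_true)


def average_precision(y_true, y_score):
    """For each true label, the share of true labels at or above its rank."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        per_label = [
            sum(1 for k in relevant if scores[k] >= scores[j]) / _rank(scores, j)
            for j in relevant
        ]
        total += sum(per_label) / len(relevant)
    return total / len(y_true)


# Hypothetical toy data: 2 proteins, 3 candidate locations each.
y_true = [[1, 0, 0], [0, 1, 1]]
y_score = [[0.9, 0.2, 0.1], [0.8, 0.6, 0.3]]

cov = coverage(y_true, y_score)
loss = ranking_loss(y_true, y_score)
ap = average_precision(y_true, y_score)
print(cov, loss, ap)
```

Higher average precision is better, while lower ranking loss and lower coverage are better. Under this coverage definition the best achievable value equals the mean number of true labels per protein.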