NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, su...
Published in: | Frontiers in genetics 2023-07, Vol.14, p.1226905-1226905 |
---|---|
Main Authors: | Liu, Di ; Lin, Zhengkui ; Jia, Cangzhi |
Format: | Article |
Language: | English |
Subjects: | convolution neural network ; Genetics ; neuropeptides ; one-hot ; stacking strategy ; word2vec |
cited_by | cdi_FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3 |
---|---|
cites | cdi_FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3 |
container_end_page | 1226905 |
container_issue | |
container_start_page | 1226905 |
container_title | Frontiers in genetics |
container_volume | 14 |
creator | Liu, Di ; Lin, Zhengkui ; Jia, Cangzhi |
description | Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides. |
doi_str_mv | 10.3389/fgene.2023.1226905 |
format | article |
pmid | 37576553 |
publisher | Switzerland: Frontiers Media S.A |
date | 2023-07-27 |
eissn | 1664-8021 |
rights | Copyright © 2023 Liu, Lin and Jia. |
oa | free_for_read (peer reviewed) |
doaj_id | oai_doaj_org_article_02890004027c4ec199b3aa9ed3c8d4ce |
pqid | 2850720627 |
linktohtml | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414792/ |
linktopdf | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414792/pdf/ |
fulltext | fulltext |
identifier | ISSN: 1664-8021 |
ispartof | Frontiers in genetics, 2023-07, Vol.14, p.1226905-1226905 |
issn | 1664-8021 1664-8021 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_02890004027c4ec199b3aa9ed3c8d4ce |
source | PubMed Central |
subjects | convolution neural network ; Genetics ; neuropeptides ; one-hot ; stacking strategy ; word2vec |
title | NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T23%3A06%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=NeuroCNN_GNB:%20an%20ensemble%20model%20to%20predict%20neuropeptides%20based%20on%20a%20convolution%20neural%20network%20and%20Gaussian%20naive%20Bayes&rft.jtitle=Frontiers%20in%20genetics&rft.au=Liu,%20Di&rft.date=2023-07-27&rft.volume=14&rft.spage=1226905&rft.epage=1226905&rft.pages=1226905-1226905&rft.issn=1664-8021&rft.eissn=1664-8021&rft_id=info:doi/10.3389/fgene.2023.1226905&rft_dat=%3Cproquest_doaj_%3E2850720627%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2850720627&rft_id=info:pmid/37576553&rfr_iscdi=true |
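
The description in this record says that four convolutional baseline models (trained on one-hot, AAIndex, G-gap dipeptide and word2vec encodings) are integrated with Gaussian naive Bayes. The sketch below illustrates only that stacking idea, in plain Python. It is not the authors' implementation: the function names `fit_gaussian_nb` and `predict_proba` are hypothetical, and the base-model probabilities in `train_X` are invented toy values rather than results from the paper.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    params = {}
    n_total = len(y)
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        params[label] = (n / n_total, means, variances)
    return params


def predict_proba(params, x):
    """Return the posterior probability of each class for one feature vector."""
    log_post = {}
    for label, (prior, means, variances) in params.items():
        total = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[label] = total
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_post.values())
    weights = {k: math.exp(v - top) for k, v in log_post.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


# Toy meta-features (assumed, not from the paper): each row holds the
# probabilities output by four hypothetical base CNNs for one peptide,
# in the order one-hot, AAIndex, G-gap dipeptide, word2vec.
train_X = [
    [0.91, 0.85, 0.78, 0.88],
    [0.80, 0.90, 0.70, 0.76],
    [0.75, 0.66, 0.81, 0.92],
    [0.12, 0.20, 0.31, 0.15],
    [0.25, 0.10, 0.22, 0.30],
    [0.18, 0.33, 0.12, 0.21],
]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = neuropeptide, 0 = non-neuropeptide

params = fit_gaussian_nb(train_X, train_y)
posterior = predict_proba(params, [0.85, 0.80, 0.75, 0.90])
print(posterior)
```

In a real stacking setup the meta-features would normally come from out-of-fold predictions of the base models, so that the meta-learner is not fitted on probabilities the base models produced for their own training data.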