NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes

Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, su...

Bibliographic Details
Published in: Frontiers in genetics 2023-07, Vol.14, p.1226905-1226905
Main Authors: Liu, Di; Lin, Zhengkui; Jia, Cangzhi
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3
cites cdi_FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3
container_end_page 1226905
container_issue
container_start_page 1226905
container_title Frontiers in genetics
container_volume 14
creator Liu, Di
Lin, Zhengkui
Jia, Cangzhi
description Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides.
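Two of the sequence encodings named in the description, one-hot and G-gap dipeptide encoding, can be illustrated with a minimal Python sketch. The record does not give the authors' exact definitions, so the function names (`one_hot`, `g_gap_dipeptide`), the zero-padding to a fixed length, the handling of non-standard residues, and the normalisation by the number of counted pairs are all assumptions made for illustration, not the published implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq, max_len):
    """Encode a peptide as a max_len x 20 matrix (assumed zero-padding/truncation)."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, aa in enumerate(seq[:max_len]):
        if aa in index:  # non-standard residues are left as all-zero rows
            matrix[pos][index[aa]] = 1
    return matrix


def g_gap_dipeptide(seq, g):
    """Frequencies of residue pairs with g residues between them (400 values)."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    # Normalise by the number of counted pairs (an assumption of this sketch).
    return [counts[p] / total if total else 0.0 for p in pairs]
```

In a stacking setup like the one the description outlines, each such encoding would feed its own baseline model, and the baseline outputs would then be combined by a second-stage classifier.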
doi_str_mv 10.3389/fgene.2023.1226905
format article
fullrecord <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_02890004027c4ec199b3aa9ed3c8d4ce</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_02890004027c4ec199b3aa9ed3c8d4ce</doaj_id><sourcerecordid>2850720627</sourcerecordid><originalsourceid>FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3</originalsourceid><addsrcrecordid>eNpVkk1vEzEQhlcIRKvSP8AB-cglwV-7trkgGpVQqQoXOFuz9iRs2bWDvZuq_x5vE6rWl7HH7zwe229VvWd0KYQ2n7Y7DLjklIsl47wxtH5VnbOmkQtNOXv9bH5WXeZ8R8uQRggh31ZnQtWqqWtxXt1vcEpxtdnY9ebqM4FAMGQc2h7JED32ZIxkn9B3biRhlu5xP3YeM2khoycxECAuhkPsp7Erq1kEfQnjfUx_CtCTNUw5dwUdoDsguYIHzO-qN1voM16e4kX169v1z9X3xe2P9c3q6-3CycaMC9cAOKmUY4Kib71GhLpFyjQHxZQGpdtt67yjAnVtPAqvqaxpA0wxjyAuqpsj10e4s_vUDZAebITOPiZi2llIY-d6tJRrMz8S5cpJdMyYVgAY9MJpLx0W1pcjaz-1A3qHYSxXfQF9uRO633YXD5ZRyaQyvBA-nggp_p0wj3bossO-h4BxypbrmipOG66KlB-lLsWcE26fzmHUzg6wjw6wswPsyQGl6MPzDp9K_v-3-AcFRLAF</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2850720627</pqid></control><display><type>article</type><title>NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes</title><source>PubMed Central</source><creator>Liu, Di ; Lin, Zhengkui ; Jia, Cangzhi</creator><creatorcontrib>Liu, Di ; Lin, Zhengkui ; Jia, Cangzhi</creatorcontrib><description>Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. 
Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides.</description><identifier>ISSN: 1664-8021</identifier><identifier>EISSN: 1664-8021</identifier><identifier>DOI: 10.3389/fgene.2023.1226905</identifier><identifier>PMID: 37576553</identifier><language>eng</language><publisher>Switzerland: Frontiers Media S.A</publisher><subject>convolution neural network ; Genetics ; neuropeptides ; one-hot ; stacking strategy ; word2vec</subject><ispartof>Frontiers in genetics, 2023-07, Vol.14, p.1226905-1226905</ispartof><rights>Copyright © 2023 Liu, Lin and Jia.</rights><rights>Copyright © 2023 Liu, Lin and Jia. 
2023 Liu, Lin and Jia</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3</citedby><cites>FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414792/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414792/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37576553$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Liu, Di</creatorcontrib><creatorcontrib>Lin, Zhengkui</creatorcontrib><creatorcontrib>Jia, Cangzhi</creatorcontrib><title>NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes</title><title>Frontiers in genetics</title><addtitle>Front Genet</addtitle><description>Neuropeptides contain more chemical information than other classical neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. 
In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides.</description><subject>convolution neural network</subject><subject>Genetics</subject><subject>neuropeptides</subject><subject>one-hot</subject><subject>stacking strategy</subject><subject>word2vec</subject><issn>1664-8021</issn><issn>1664-8021</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNpVkk1vEzEQhlcIRKvSP8AB-cglwV-7trkgGpVQqQoXOFuz9iRs2bWDvZuq_x5vE6rWl7HH7zwe229VvWd0KYQ2n7Y7DLjklIsl47wxtH5VnbOmkQtNOXv9bH5WXeZ8R8uQRggh31ZnQtWqqWtxXt1vcEpxtdnY9ebqM4FAMGQc2h7JED32ZIxkn9B3biRhlu5xP3YeM2khoycxECAuhkPsp7Erq1kEfQnjfUx_CtCTNUw5dwUdoDsguYIHzO-qN1voM16e4kX169v1z9X3xe2P9c3q6-3CycaMC9cAOKmUY4Kib71GhLpFyjQHxZQGpdtt67yjAnVtPAqvqaxpA0wxjyAuqpsj10e4s_vUDZAebITOPiZi2llIY-d6tJRrMz8S5cpJdMyYVgAY9MJpLx0W1pcjaz-1A3qHYSxXfQF9uRO633YXD5ZRyaQyvBA-nggp_p0wj3bossO-h4BxypbrmipOG66KlB-lLsWcE26fzmHUzg6wjw6wswPsyQGl6MPzDp9K_v-3-AcFRLAF</recordid><startdate>20230727</startdate><enddate>20230727</enddate><creator>Liu, Di</creator><creator>Lin, Zhengkui</creator><creator>Jia, Cangzhi</creator><general>Frontiers Media 
S.A</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20230727</creationdate><title>NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes</title><author>Liu, Di ; Lin, Zhengkui ; Jia, Cangzhi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>convolution neural network</topic><topic>Genetics</topic><topic>neuropeptides</topic><topic>one-hot</topic><topic>stacking strategy</topic><topic>word2vec</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Liu, Di</creatorcontrib><creatorcontrib>Lin, Zhengkui</creatorcontrib><creatorcontrib>Jia, Cangzhi</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals</collection><jtitle>Frontiers in genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Liu, Di</au><au>Lin, Zhengkui</au><au>Jia, Cangzhi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes</atitle><jtitle>Frontiers in genetics</jtitle><addtitle>Front Genet</addtitle><date>2023-07-27</date><risdate>2023</risdate><volume>14</volume><spage>1226905</spage><epage>1226905</epage><pages>1226905-1226905</pages><issn>1664-8021</issn><eissn>1664-8021</eissn><abstract>Neuropeptides contain more chemical information than other classical 
neurotransmitters and have multiple receptor recognition sites. These characteristics allow neuropeptides to have a correspondingly higher selectivity for nerve receptors and fewer side effects. Traditional experimental methods, such as mass spectrometry and liquid chromatography technology, still need the support of a complete neuropeptide precursor database and the basic characteristics of neuropeptides. Incomplete neuropeptide precursor and information databases will lead to false-positives or reduce the sensitivity of recognition. In recent years, studies have proven that machine learning methods can rapidly and effectively predict neuropeptides. In this work, we have made a systematic attempt to create an ensemble tool based on four convolution neural network models. These baseline models were separately trained on one-hot encoding, AAIndex, G-gap dipeptide encoding and word2vec and integrated using Gaussian Naive Bayes (NB) to construct our predictor designated NeuroCNN_GNB. Both 5-fold cross-validation tests using benchmark datasets and independent tests showed that NeuroCNN_GNB outperformed other state-of-the-art methods. Furthermore, this novel framework provides essential interpretations that aid the understanding of model success by leveraging the powerful Shapley Additive exPlanation (SHAP) algorithm, thereby highlighting the most important features relevant for predicting neuropeptides.</abstract><cop>Switzerland</cop><pub>Frontiers Media S.A</pub><pmid>37576553</pmid><doi>10.3389/fgene.2023.1226905</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1664-8021
ispartof Frontiers in genetics, 2023-07, Vol.14, p.1226905-1226905
issn 1664-8021
1664-8021
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_02890004027c4ec199b3aa9ed3c8d4ce
source PubMed Central
subjects convolution neural network
Genetics
neuropeptides
one-hot
stacking strategy
word2vec
title NeuroCNN_GNB: an ensemble model to predict neuropeptides based on a convolution neural network and Gaussian naive Bayes
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T23%3A06%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=NeuroCNN_GNB:%20an%20ensemble%20model%20to%20predict%20neuropeptides%20based%20on%20a%20convolution%20neural%20network%20and%20Gaussian%20naive%20Bayes&rft.jtitle=Frontiers%20in%20genetics&rft.au=Liu,%20Di&rft.date=2023-07-27&rft.volume=14&rft.spage=1226905&rft.epage=1226905&rft.pages=1226905-1226905&rft.issn=1664-8021&rft.eissn=1664-8021&rft_id=info:doi/10.3389/fgene.2023.1226905&rft_dat=%3Cproquest_doaj_%3E2850720627%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c469t-c6aac477c130edbd8eea5be0182a7178a78bfbcdc03e859de3d804506a171dea3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2850720627&rft_id=info:pmid/37576553&rfr_iscdi=true