PASSION: an ensemble neural network approach for identifying the binding sites of RBPs on circRNAs

Bibliographic Details
Published in: Bioinformatics 2020-08, Vol.36 (15), p.4276-4282
Main Authors: Jia, Cangzhi, Bi, Yue, Chen, Jinxiang, Leier, André, Li, Fuyi, Song, Jiangning
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c401t-fcaf3bb88c2ddb1c5249293395f2064679b805ae3f0bd69fa928805bd692d2923
cites cdi_FETCH-LOGICAL-c401t-fcaf3bb88c2ddb1c5249293395f2064679b805ae3f0bd69fa928805bd692d2923
container_end_page 4282
container_issue 15
container_start_page 4276
container_title Bioinformatics
container_volume 36
creator Jia, Cangzhi
Bi, Yue
Chen, Jinxiang
Leier, André
Li, Fuyi
Song, Jiangning
description Motivation: Different from traditional linear RNAs (containing 5′ and 3′ ends), circular RNAs (circRNAs) are a special type of RNAs that have a closed ring structure. Accumulating evidence has indicated that circRNAs can directly bind proteins and participate in a myriad of different biological processes. Results: For identifying the interaction of circRNAs with 37 different types of circRNA-binding proteins (RBPs), we develop an ensemble neural network, termed PASSION, which is based on the concatenated artificial neural network (ANN) and hybrid deep neural network frameworks. Specifically, the input of the ANN is the optimal feature subset for each RBP, which has been selected from six types of feature encoding schemes through incremental feature selection and application of the XGBoost algorithm. In turn, the input of the hybrid deep neural network is a stacked codon-based scheme. Benchmarking experiments indicate that the ensemble neural network reaches the average best area under the curve (AUC) of 0.883 across the 37 circRNA datasets when compared with XGBoost, k-nearest neighbor, support vector machine, random forest, logistic regression and Naive Bayes. Moreover, each of the 37 RBP models is extensively tested by performing independent tests, with the varying sequence similarity thresholds of 0.8, 0.7, 0.6 and 0.5, respectively. The corresponding average AUC obtained are 0.883, 0.876, 0.868 and 0.883, respectively, highlighting the effectiveness and robustness of PASSION. Extensive benchmarking experiments demonstrate that PASSION achieves a competitive performance for identifying binding sites between circRNA and RBPs, when compared with several state-of-the-art methods. Availability and implementation: A user-friendly web server of PASSION is publicly accessible at http://flagship.erc.monash.edu/PASSION/. Supplementary information: Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/btaa522
format article
publisher England: Oxford University Press
pmid 32426818
contributor Mathelier, Anthony
orcidid https://orcid.org/0000-0001-8031-9086
https://orcid.org/0000-0001-5216-3213
rights The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2020-08, Vol.36 (15), p.4276-4282
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_proquest_miscellaneous_2404642692
source OUP_Oxford University Press OA journals; NCBI_PubMed Central (free)
subjects Bayes Theorem
Binding Sites
Neural Networks, Computer
RNA, Circular
RNA-Binding Proteins - metabolism
title PASSION: an ensemble neural network approach for identifying the binding sites of RBPs on circRNAs
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T07%3A47%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PASSION:%20an%20ensemble%20neural%20network%20approach%20for%20identifying%20the%20binding%20sites%20of%20RBPs%20on%20circRNAs&rft.jtitle=Bioinformatics&rft.au=Jia,%20Cangzhi&rft.date=2020-08-01&rft.volume=36&rft.issue=15&rft.spage=4276&rft.epage=4282&rft.pages=4276-4282&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/btaa522&rft_dat=%3Cproquest_cross%3E2404642692%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c401t-fcaf3bb88c2ddb1c5249293395f2064679b805ae3f0bd69fa928805bd692d2923%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2404642692&rft_id=info:pmid/32426818&rft_oup_id=10.1093/bioinformatics/btaa522&rfr_iscdi=true
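
The description in this record mentions choosing an optimal feature subset for each RBP through incremental feature selection. The general idea can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `incremental_feature_selection`, the feature names and the scoring function `toy_score` are all hypothetical. In practice the ranking might come from XGBoost feature importances and the score from a cross-validated AUC, but neither is computed here.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Add ranked features one at a time and keep the best-scoring prefix.

    ranked_features: feature names sorted from most to least important
                     (the ranking itself is assumed to be given).
    evaluate:        hypothetical callback returning a score, higher is better.
    """
    best_subset: List[str] = []
    best_score = float("-inf")
    current: List[str] = []
    for feature in ranked_features:
        current.append(feature)
        score = evaluate(current)
        # Strict comparison: on ties the smaller subset is kept.
        if score > best_score:
            best_score = score
            best_subset = list(current)
    return best_subset, best_score


def toy_score(subset: Sequence[str]) -> float:
    """Toy stand-in for a model score: rewards two useful features,
    and charges a small penalty for every feature included."""
    useful = {"f1", "f2"}
    return sum(1.0 for f in subset if f in useful) - 0.1 * len(subset)


subset, score = incremental_feature_selection(["f1", "f2", "f3", "f4"], toy_score)
print(subset, round(score, 3))
```

The loop evaluates each prefix of the ranked list exactly once, so the cost is one model evaluation per feature rather than one per possible subset.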