DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle

Predicting the secondary structure of RNA is vital for studying its function, but determining that structure is challenging, especially for RNA with pseudoknots. Typically, several excellent computational methods can be utilized to predict the secondary structure (with or without pseu...

Bibliographic Details
Published in: Frontiers in genetics, 2019-03, Vol.10, p.143
Main Authors: Wang, Linyu; Liu, Yuanning; Zhong, Xiaodan; Liu, Haiming; Lu, Chao; Li, Cong; Zhang, Hao
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c462t-21d3b46620df123b0e28172063d61d8d593bedc3ee9c35329aa0e3df52aad2753
cites cdi_FETCH-LOGICAL-c462t-21d3b46620df123b0e28172063d61d8d593bedc3ee9c35329aa0e3df52aad2753
container_end_page
container_issue
container_start_page 143
container_title Frontiers in genetics
container_volume 10
creator Wang, Linyu
Liu, Yuanning
Zhong, Xiaodan
Liu, Haiming
Lu, Chao
Li, Cong
Zhang, Hao
description Predicting the secondary structure of RNA is vital for studying its function, but determining that structure is challenging, especially for RNA with pseudoknots. Several computational methods can predict the secondary structure (with or without pseudoknots), each with its own merits and drawbacks. These methods fall into two categories: the multi-sequence method and the single-sequence method. The main advantage of the multi-sequence method is that it uses auxiliary sequences to assist in predicting the secondary structure, but it succeeds only when multiple highly homologous sequences are available. The single-sequence method is easy to operate (it needs only the target sequence to predict the secondary structure), but its folding parameters capture features common to diverse RNAs and cannot describe the unique characteristics of a particular RNA, which can result in low prediction accuracy for some RNAs. In this paper, "DMfold," a method based on deep learning and an improved base pair maximization principle, is proposed to predict the secondary structure with pseudoknots; it combines the advantages of the two methods while avoiding some of their disadvantages. Notably, DMfold predicts the secondary structure of an RNA by learning from similar RNAs with known structures, using similar RNA sequences instead of the highly homologous sequences required by the multi-sequence method, thereby reducing the requirement for auxiliary sequences. DMfold needs only the target sequence as input to predict the secondary structure. Its folding parameters are extracted automatically by deep learning, which avoids the lack of folding parameters in the single-sequence method. Experiments show that our method is simple to operate and improves prediction accuracy compared with several existing prediction methods.
A repository containing our code can be found at https://github.com/linyuwangPHD/RNA-Secondary-Structure-Database.
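The description names a base pair maximization principle without spelling it out. As a rough illustration only, the sketch below shows the classic Nussinov-style dynamic program for maximizing nested base pairs. It is not DMfold's improved variant, it uses no learned parameters, and it cannot represent pseudoknots; the function name `max_base_pairs`, the `min_loop` parameter, and the set of allowed pairs (Watson-Crick plus G-U wobble) are assumptions made for this example.

```python
def max_base_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq.

    Classic Nussinov-style recursion (an illustrative assumption,
    not the method from the article). min_loop is the minimum number
    of unpaired bases required between two paired positions.
    """
    # Allowed pairs: Watson-Crick plus G-U wobble (assumed for this sketch).
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, leaving a loop of
            # at least min_loop unpaired bases between k and j
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, `max_base_pairs("GGGAAACCC")` counts the three G-C pairs of a simple hairpin. Because the recursion only combines nested subproblems, crossing pairs (pseudoknots) are excluded by construction, which is one reason a method targeting pseudoknots needs something beyond this baseline.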
doi_str_mv 10.3389/fgene.2019.00143
format article
date 2019-03-04
eissn 1664-8021
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6409321/
oa free_for_read
pmid 30886627
publisher Switzerland: Frontiers Media S.A
rights Copyright © 2019 Wang, Liu, Zhong, Liu, Lu, Li and Zhang
fulltext fulltext
identifier ISSN: 1664-8021
ispartof Frontiers in genetics, 2019-03, Vol.10, p.143
issn 1664-8021
1664-8021
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_ef6efcfe687d412a9362765b1e96cf47
source PubMed Central
subjects deep learning
Genetics
multi-sequence method
pseudoknot
RNA
secondary structure prediction
single-sequence method
title DMfold: A Novel Method to Predict RNA Secondary Structure With Pseudoknots Based on Deep Learning and Improved Base Pair Maximization Principle
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T12%3A57%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DMfold:%20A%20Novel%20Method%20to%20Predict%20RNA%20Secondary%20Structure%20With%20Pseudoknots%20Based%20on%20Deep%20Learning%20and%20Improved%20Base%20Pair%20Maximization%20Principle&rft.jtitle=Frontiers%20in%20genetics&rft.au=Wang,%20Linyu&rft.date=2019-03-04&rft.volume=10&rft.spage=143&rft.pages=143-&rft.issn=1664-8021&rft.eissn=1664-8021&rft_id=info:doi/10.3389/fgene.2019.00143&rft_dat=%3Cproquest_doaj_%3E2194143027%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c462t-21d3b46620df123b0e28172063d61d8d593bedc3ee9c35329aa0e3df52aad2753%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2194143027&rft_id=info:pmid/30886627&rfr_iscdi=true