M6A-GSMS: Computational identification of N6-methyladenosine sites with GBDT and stacking learning in multiple species

Bibliographic Details
Published in: Journal of Biomolecular Structure & Dynamics, 2022-01, Vol. 40 (22), p. 12380-12391
Main Authors: Zhang, Shengli; Wang, Jinyue; Li, Xinjie; Liang, Yunyun
Format: Article
Language: English
Description
Summary: N6-methyladenosine (m6A) is one of the most abundant forms of RNA methylation modification currently known. It is involved in a wide range of biological processes, including degradation, stability and alternative splicing. Therefore, the development of convenient and efficient m6A prediction technologies is urgent. In this work, a novel predictor based on GBDT and stacking learning, called M6A-GSMS, is developed to identify m6A sites. To achieve accurate prediction, we explore RNA sequence information from four aspects: correlation, structure, physicochemical properties and pseudo ribonucleic acid composition. After using the GBDT algorithm for feature selection, a stacking model is constructed by combining seven basic classifiers. Compared with other state-of-the-art methods, the results show that M6A-GSMS achieves excellent performance in identifying m6A sites. The prediction accuracies for A. thaliana, D. melanogaster, M. musculus, S. cerevisiae and human reach 88.4%, 60.8%, 80.5%, 92.4% and 61.8%, respectively. This method provides an effective prediction tool for the investigation of m6A sites. In addition, all the datasets and codes are currently available at https://github.com/Wang-Jinyue/M6A-GSMS . Communicated by Ramaswamy H. Sarma
ISSN: 0739-1102, 1538-0254
DOI: 10.1080/07391102.2021.1970628
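
The two-stage design named in the summary (GBDT-based feature selection followed by a stacked ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the encoded RNA features, and substitutes two stand-in base classifiers because the summary does not name the seven that M6A-GSMS combines. The function name `build_model`, the median importance threshold, the logistic-regression meta-learner and all hyperparameters are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline


def build_model(random_state=0):
    """Hypothetical helper: GBDT feature selection, then a stacked ensemble."""
    # Stage 1: keep features whose GBDT importance is at or above the median
    # (the threshold is an assumption, not taken from the paper).
    selector = SelectFromModel(
        GradientBoostingClassifier(n_estimators=50, random_state=random_state),
        threshold="median",
    )
    # Stage 2: stack base classifiers; a meta-learner combines their
    # cross-validated predictions. Two stand-ins replace the paper's seven.
    stack = StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
            ("knn", KNeighborsClassifier()),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=3,
    )
    return make_pipeline(selector, stack)


# Synthetic stand-in for encoded sequence features (not the paper's datasets).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = build_model()
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
print(f"selected {n_selected} of {X.shape[1]} features, accuracy {accuracy:.3f}")
```

Wrapping both stages in one pipeline means feature selection is refit on the training data only, so the held-out accuracy is not inflated by information from the test split.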