
DNA Methylation Prediction Using Reduced Features Obtained via Gappy Pair Kernel and Partial Least Square

Bibliographic Details
Published in: IEEE Access, 2022, Vol. 10, p. 53265-53274
Main Authors: Shah, Sajid; Rahman, Altaf Ur; Jabeen, Saima; Khan, Ahmad; Khan, Fiaz Gul; Elaffendi, Mohammed
Format: Article
Language: English
Description
Summary: Correct identification of DNA methylation is critical because methylation has been linked to a variety of human disorders, particularly cancer. DNA methylation is an epigenetic process that allows cells to alter gene expression. This work deals with a type of DNA methylation called 5-methylcytosine (m5C), in which a methyl group (CH3) is attached to the 5th carbon of cytosine. The performance of machine learning algorithms used for methylation identification degrades considerably when the input sequence data are poorly represented. In the current work, we propose a classification model based on extracting highly discriminative features from the sample sequences using the gappy pair kernel. Increasing the number of features to better represent a sequence leads to the curse of dimensionality, which we handle with a dimensionality reduction technique called partial least squares (PLS). The reduced features are then passed to multiple classifiers to test their discriminating power. Results are computed across species, i.e., human and mouse, to check the robustness of the proposed model. Finally, the results are compared with state-of-the-art approaches in terms of sensitivity, specificity, and accuracy. Our proposed approach outperforms state-of-the-art techniques on all three metrics for both datasets. So that the research community can test our technique, we have uploaded our code to GitHub (https://github.com/sajidshahbs/gappypairKernel_Rcode).
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3174260