A Heterogeneous Information Network Model for Long Non-Coding RNA Function Prediction

Bibliographic Details
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2022-01, Vol. 19 (1), p. 255-266
Main Authors: V, Sunil Kumar P; Thahsin, Adheeba; M, Manju; G, Gopakumar
Format: Article
Language: English
Description
Summary: Exciting information on the functional roles played by long non-coding RNAs (lncRNAs) has drawn substantial research attention in recent years. With the advent of techniques such as RNA-Seq, thousands of lncRNAs are identified in very short time spans. However, due to the poor annotation rate, only a few of them are functionally characterised. Wet lab experiments to elucidate lncRNA functions are challenging, slow and sometimes prohibitively expensive. This work attempts to solve the crucial problem of developing computational methods to predict lncRNA functions. The model presented here predicts the functions of lncRNAs using a meta-path based measure, AvgSim, on a Heterogeneous Information Network (HIN). The network is constructed from existing protein and function association data of lncRNAs, lncRNA co-expression data and protein-protein interaction data. Of the 2,758 lncRNAs considered in the experiment, the proposed method predicts possible functions for 2,695 lncRNAs with an accuracy of 73.68 percent and is found to perform better than other state-of-the-art approaches on an independent test set. A case study of two well-known lncRNAs (HOTAIR and H19) is conducted and their associated functions are identified. The results were validated using experimental evidence from the literature. The script and data used for the implementation of the model are freely available at: http://bdbl.nitc.ac.in/LncFunPred/index.html
ISSN: 1545-5963, 1557-9964
DOI: 10.1109/TCBB.2020.3000518