Loading…

HumDLoc: Human Protein Subcellular Localization Prediction Using Deep Neural Network

Aims: To develop a tool that can annotate subcellular localization of human proteins. Background: With the progression of high throughput human proteomics projects, an enormous amount of protein sequence data has been discovered in the recent past. All these raw sequence data require precise mapping...

Full description

Saved in:
Bibliographic Details
Published in:Current genomics 2020-11, Vol.21 (7), p.546-557
Main Authors: Semwal, Rahul, Varadwaj, Pritish Kumar
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Aims: To develop a tool that can annotate subcellular localization of human proteins. Background: With the progression of high throughput human proteomics projects, an enormous amount of protein sequence data has been discovered in the recent past. All these raw sequence data require precise mapping and annotation for their respective biological role and functional attributes. The functional characteristics of protein molecules are highly dependent on the subcellular localization/ compartment. Therefore, a fully automated and reliable protein subcellular localization prediction system would be very useful for current proteomic research. Objective: To develop a machine learning-based predictive model that can annotate the subcellular localization of human proteins with high accuracy and precision. Methods: In this study, we used the PSI-CD-HIT homology criterion and utilized the sequence-based features of protein sequences to develop a powerful subcellular localization predictive model. The dataset used to train the HumDLoc model was extracted from a reliable data source, Uniprot knowledge base, which helps the model to generalize on the unseen dataset. Results: The proposed model, HumDLoc, was compared with two of the most widely used techniques: CELLO and DeepLoc, and other machine learning-based tools. The result demonstrated promising predictive performance of HumDLoc model based on various machine learning parameters such as accuracy (≥97.00%), precision (≥0.86), recall (≥0.89), MCC score (≥0.86), ROC curve (0.98 square unit), and precision-recall curve (0.93 square unit). Conclusion: In conclusion, HumDLoc was able to outperform several alternative tools for correctly predicting subcellular localization of human proteins. The HumDLoc has been hosted as a web-based tool at https://bioserver.iiita.ac.in/HumDLoc/.
ISSN:1389-2029
1875-5488
DOI:10.2174/1389202921999200528160534