Sequence-based prediction of protein-peptide binding sites using support vector machine

Bibliographic Details
Published in: Journal of Computational Chemistry, 2016-05, Vol. 37 (13), p. 1223-1229
Main Authors: Taherzadeh, Ghazaleh; Yang, Yuedong; Zhang, Tuo; Liew, Alan Wee-Chung; Zhou, Yaoqi
Format: Article
Language: English
Description
Summary: Protein–peptide interactions are essential for all cellular processes, including DNA repair, replication, gene expression, and metabolism. As most protein–peptide interactions are uncharacterized, it is cost‐effective to investigate them computationally as a first step. All existing approaches for predicting protein–peptide binding sites, however, are based on protein structures, despite the fact that the structures of most proteins are not yet solved. This article proposes the first machine‐learning method, called SPRINT, to make Sequence‐based prediction of Protein–peptide Residue‐level Interactions. SPRINT yields robust and consistent performance in 10‐fold cross‐validation and on an independent test set. The most important feature is evolution‐generated sequence profiles. For the test set (1056 binding and non‐binding residues), it yields a Matthews correlation coefficient of 0.326 with a sensitivity of 64% and a specificity of 68%. This sequence‐based technique is comparable to or more accurate than structure‐based methods for peptide‐binding site prediction. SPRINT is available as an online server at: http://sparks-lab.org/. © 2016 Wiley Periodicals, Inc.

Protein–peptide interactions play vital roles in cellular processes. Experimental determination of protein–peptide interactions, however, is difficult and costly because of peptide flexibility and low binding affinity. Thus, making an “educated” computational prediction prior to experimental studies is necessary. All existing computational techniques infer peptide binding sites from protein structures, although the structures of the majority of proteins are unknown. Here the first sequence‐based method is developed, and its accuracy is shown to be comparable to or better than that of existing structure‐based techniques.
ISSN: 0192-8651, 1096-987X
DOI: 10.1002/jcc.24314
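
Note on the reported metrics: the summary quotes a Matthews correlation coefficient, a sensitivity, and a specificity for residue-level binding-site prediction. As a minimal sketch of how these three quantities relate to a binary confusion matrix, the Python below uses the standard textbook definitions. The function name `binary_metrics` and the counts in the usage lines are hypothetical and made up for illustration; the record does not give the paper's confusion matrix, so these numbers are not the authors' data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    """
    sensitivity = tp / (tp + fn)   # fraction of true binding residues recovered
    specificity = tn / (tn + fp)   # fraction of non-binding residues rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical toy counts (NOT taken from the paper): 100 binding and
# 100 non-binding residues.
sens, spec, mcc = binary_metrics(tp=64, fp=32, tn=68, fn=36)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} mcc={mcc:.3f}")
```

Unlike sensitivity and specificity, the MCC depends on the ratio of binding to non-binding residues, so it cannot be recovered from the two percentages alone.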