Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient
Published in: | Computational and structural biotechnology journal 2021-01, Vol.19, p.5255-5263 |
---|---|
Format: | Article |
Language: | English |
Summary: |
• Provided a family of efficient network algorithms for protein complex identification.
• The parameter-free family outperforms existing approaches on different networks.
• It exactly recovered ~35% of protein complexes in a pan-plant PPI network.
• We examined the effect of network perturbations on predicted protein complexes.
Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, addressed by parameter-dependent approaches that suffer from low recall. Here we introduce GCC-v, a family of efficient, parameter-free algorithms to accurately predict protein complexes using the (weighted) clustering coefficient of proteins in PPI networks. Through comparative analyses with gold standards and PPI networks from Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens, we demonstrate that GCC-v outperforms twelve state-of-the-art approaches for identification of protein complexes with respect to twelve performance measures in at least 85.71% of scenarios. We also show that GCC-v exactly recovers ∼35% of protein complexes in a pan-plant PPI network, and we discover 144 new protein complexes in Arabidopsis thaliana, with high support from GO semantic similarity. Our results indicate that findings from GCC-v are robust to network perturbations, which has direct implications for assessing the impact of PPI network quality on the predicted protein complexes. |
---|---|
ISSN: | 2001-0370 |
DOI: | 10.1016/j.csbj.2021.09.014 |
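
The summary above refers to the clustering coefficient of proteins in a PPI network. As a minimal illustration of that quantity only, and not of the GCC-v algorithms themselves (whose steps are not described in this record), the sketch below computes the standard unweighted local clustering coefficient: the fraction of pairs of a node's neighbours that are themselves connected. The function name `clustering_coefficient` and the toy network `toy_ppi` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Local (unweighted) clustering coefficient of `node`.

    adj: dict mapping each node to the set of its neighbours
         in an undirected graph without self-loops.
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        # The ratio is undefined for degree < 2; 0.0 is a common convention.
        return 0.0
    # Count the edges that exist between pairs of neighbours.
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy network: proteins A, B, C form a triangle; D attaches to C.
toy_ppi = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

coefficients = {p: clustering_coefficient(toy_ppi, p) for p in sorted(toy_ppi)}
```

In this toy network the two neighbours of A are connected to each other, so A has coefficient 1, whereas only one of the three neighbour pairs of C is connected, giving 1/3. A weighted variant, as mentioned in the summary, would additionally take edge weights into account; its exact form is not given here.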