Prediction of protein–protein interactions based on elastic net and deep forest
• A novel method (GcForest-PPI) to predict protein–protein interactions.
• The PseAAC, AD, MMI, CTD, AAC-PSSM and DPC-PSSM are fused to extract feature information.
• The elastic net is employed to eliminate redundant and irrelevant features.
• We firstly use deep forest as classifier to predict PPIs via l...
Published in: | Expert systems with applications 2021-08, Vol.176, p.114876, Article 114876 |
---|---|
Main Authors: | Yu, Bin; Chen, Cheng; Wang, Xiaolin; Yu, Zhaomin; Ma, Anjun; Liu, Bingqiang |
Format: | Article |
Language: | English |
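Subjects: | Amino acids; Composition; Decision trees; Deep forest; Elastic net; Experiments; Machine learning; Multi-information fusion; Performance prediction; Protein-protein interactions; Proteins; R&D; Research & development |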
cited_by | cdi_FETCH-LOGICAL-c438t-f0624533a2d58cf788c99c138f5491b28e7f892d43914788035e3cc72e9780833 |
---|---|
cites | cdi_FETCH-LOGICAL-c438t-f0624533a2d58cf788c99c138f5491b28e7f892d43914788035e3cc72e9780833 |
container_end_page | |
container_issue | |
container_start_page | 114876 |
container_title | Expert systems with applications |
container_volume | 176 |
creator | Yu, Bin ; Chen, Cheng ; Wang, Xiaolin ; Yu, Zhaomin ; Ma, Anjun ; Liu, Bingqiang |
description | • A novel method (GcForest-PPI) to predict protein–protein interactions. • PseAAC, AD, MMI, CTD, AAC-PSSM and DPC-PSSM are fused to extract feature information. • The elastic net is employed to eliminate redundant and irrelevant features. • Deep forest is used for the first time as a classifier to predict PPIs via layer-by-layer processing of raw features. • The GcForest-PPI model generalizes well on cross-species datasets and PPI networks.
Predicting protein–protein interactions (PPIs) helps reveal the molecular roots of disease. However, wet-lab experiments for identifying PPIs are limited and costly. Machine-learning-based frameworks offer a promising alternative: they can identify PPIs automatically and also provide new ideas for drug research and development. We present a novel deep-forest-based method for PPI prediction. First, pseudo amino acid composition (PAAC), autocorrelation descriptor (Auto), multivariate mutual information (MMI), composition-transition-distribution (CTD), amino acid composition position-specific scoring matrix (AAC-PSSM), and dipeptide composition PSSM (DPC-PSSM) are used to extract features and construct the pattern of PPIs. Second, the elastic net is used to optimize the initial feature vectors and boost predictive performance. Finally, we ensemble XGBoost, random forest, and extremely randomized trees in a cascade architecture to construct a deep forest model for PPI prediction (GcForest-PPI). Benchmark experiments show that the proposed approach outperforms other state-of-the-art predictors on Saccharomyces cerevisiae and Helicobacter pylori. We also apply GcForest-PPI to independent test sets, the CD9-core network, the crossover network, and the cancer-specific network. The evaluation shows that GcForest-PPI can boost prediction accuracy, complement experiments, and improve drug discovery. |
doi_str_mv | 10.1016/j.eswa.2021.114876 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2543515136</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0957417421003171</els_id><sourcerecordid>2543515136</sourcerecordid><originalsourceid>FETCH-LOGICAL-c438t-f0624533a2d58cf788c99c138f5491b28e7f892d43914788035e3cc72e9780833</originalsourceid><addsrcrecordid>eNp9kE1KBDEQhYMoOI5ewFXAdbf57aTBjQz-wYAKug6ZdAXSjN1jklHceQdv6EnM2K5dVUG9V-_xIXRKSU0Jbc77GtK7rRlhtKZUaNXsoRnVileNavk-mpFWqkpQJQ7RUUo9IVQRombo8SFCF1wO44BHjzdxzBCG78-vvw2HIUO0v4KEVzZBh4sU1jbl4PAAGduhwx3ABvsxQsrH6MDbdYKTvzlHz9dXT4vbanl_c7e4XFZOcJ0rTxomJOeWdVI7r7R2beso116Klq6YBuV1yzrBWyrKlXAJ3DnFoFWaaM7n6Gz6W5q-bkuw6cdtHEqkYVJwSSXlTVGxSeXimFIEbzYxvNj4YSgxO3SmNzt0ZofOTOiK6WIyQen_FiCa5AIMrpCK4LLpxvCf_QctIndX</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2543515136</pqid></control><display><type>article</type><title>Prediction of protein–protein interactions based on elastic net and deep forest</title><source>ScienceDirect Freedom Collection</source><creator>Yu, Bin ; Chen, Cheng ; Wang, Xiaolin ; Yu, Zhaomin ; Ma, Anjun ; Liu, Bingqiang</creator><creatorcontrib>Yu, Bin ; Chen, Cheng ; Wang, Xiaolin ; Yu, Zhaomin ; Ma, Anjun ; Liu, Bingqiang</creatorcontrib><description>•A novel method (GcForest-PPI) to predict protein–protein interactions.•The PseAAC, AD, MMI, CTD, AAC-PSSM and DPC-PSSM are fused to extract feature information.•The elastic net is employed to eliminate redundant and irrelevant features.•We firstly use deep forest as classifier to predict PPIs via layer-by-layer processing of raw features.•GcForest-PPI model has good generalization ability on cross-species datasets and PPIs network.
Prediction of protein–protein interactions (PPIs) helps to grasp molecular roots of disease. However, web-lab experiments to predict PPIs are limited and costly. Using machine-learning-based frameworks can not only automatically identify PPIs, but also provide new ideas for drug research and development from a promising alternative. We present a novel deep-forest-based method for PPIs prediction. Firstly, pseudo amino acid composition (PAAC), autocorrelation descriptor (Auto), multivariate mutual information (MMI), composition-transition-distribution (CTD), amino acid composition position-specific scoring matrix (AAC-PSSM), and dipeptide composition PSSM (DPC-PSSM) are adopted to extract and construct the pattern of PPIs. Secondly, elastic net is utilized to optimize the initial feature vectors and boost the predictive performance. Finally, we ensemble XGBoost, random forest, and extremely randomized trees to construct deep forest model via cascade architecture for PPIs prediction (GcForest-PPI). Benchmark experiments reveal that the proposed approach outperforms other state-of-the-art predictors on Saccharomyces cerevisiae and Helicobacter pylori. We also apply GcForest-PPI on independent test sets, CD9-core network, crossover network, and cancer-specific network. 
The evaluation shows that GcForest-PPI can boost the prediction accuracy, complement experiments and improve drug discovery.</description><identifier>ISSN: 0957-4174</identifier><identifier>EISSN: 1873-6793</identifier><identifier>DOI: 10.1016/j.eswa.2021.114876</identifier><language>eng</language><publisher>New York: Elsevier Ltd</publisher><subject>Amino acids ; Composition ; Decision trees ; Deep forest ; Elastic net ; Experiments ; Machine learning ; Multi-information fusion ; Performance prediction ; Protein-protein interactions ; Proteins ; R&D ; Research & development</subject><ispartof>Expert systems with applications, 2021-08, Vol.176, p.114876, Article 114876</ispartof><rights>2021 Elsevier Ltd</rights><rights>Copyright Elsevier BV Aug 15, 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c438t-f0624533a2d58cf788c99c138f5491b28e7f892d43914788035e3cc72e9780833</citedby><cites>FETCH-LOGICAL-c438t-f0624533a2d58cf788c99c138f5491b28e7f892d43914788035e3cc72e9780833</cites><orcidid>0000-0002-5734-1135 ; 0000-0001-8785-8058 ; 0000-0002-7310-7963 ; 0000-0002-4354-5508 ; 0000-0002-2453-7852 ; 0000-0001-6269-398X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,27905,27906</link.rule.ids></links><search><creatorcontrib>Yu, Bin</creatorcontrib><creatorcontrib>Chen, Cheng</creatorcontrib><creatorcontrib>Wang, Xiaolin</creatorcontrib><creatorcontrib>Yu, Zhaomin</creatorcontrib><creatorcontrib>Ma, Anjun</creatorcontrib><creatorcontrib>Liu, Bingqiang</creatorcontrib><title>Prediction of protein–protein interactions based on elastic net and deep forest</title><title>Expert systems with applications</title><description>•A novel method (GcForest-PPI) to predict protein–protein interactions.•The PseAAC, AD, MMI, CTD, AAC-PSSM and 
DPC-PSSM are fused to extract feature information.•The elastic net is employed to eliminate redundant and irrelevant features.•We firstly use deep forest as classifier to predict PPIs via layer-by-layer processing of raw features.•GcForest-PPI model has good generalization ability on cross-species datasets and PPIs network.
Prediction of protein–protein interactions (PPIs) helps to grasp molecular roots of disease. However, web-lab experiments to predict PPIs are limited and costly. Using machine-learning-based frameworks can not only automatically identify PPIs, but also provide new ideas for drug research and development from a promising alternative. We present a novel deep-forest-based method for PPIs prediction. Firstly, pseudo amino acid composition (PAAC), autocorrelation descriptor (Auto), multivariate mutual information (MMI), composition-transition-distribution (CTD), amino acid composition position-specific scoring matrix (AAC-PSSM), and dipeptide composition PSSM (DPC-PSSM) are adopted to extract and construct the pattern of PPIs. Secondly, elastic net is utilized to optimize the initial feature vectors and boost the predictive performance. Finally, we ensemble XGBoost, random forest, and extremely randomized trees to construct deep forest model via cascade architecture for PPIs prediction (GcForest-PPI). Benchmark experiments reveal that the proposed approach outperforms other state-of-the-art predictors on Saccharomyces cerevisiae and Helicobacter pylori. We also apply GcForest-PPI on independent test sets, CD9-core network, crossover network, and cancer-specific network. 
The evaluation shows that GcForest-PPI can boost the prediction accuracy, complement experiments and improve drug discovery.</description><subject>Amino acids</subject><subject>Composition</subject><subject>Decision trees</subject><subject>Deep forest</subject><subject>Elastic net</subject><subject>Experiments</subject><subject>Machine learning</subject><subject>Multi-information fusion</subject><subject>Performance prediction</subject><subject>Protein-protein interactions</subject><subject>Proteins</subject><subject>R&D</subject><subject>Research & development</subject><issn>0957-4174</issn><issn>1873-6793</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kE1KBDEQhYMoOI5ewFXAdbf57aTBjQz-wYAKug6ZdAXSjN1jklHceQdv6EnM2K5dVUG9V-_xIXRKSU0Jbc77GtK7rRlhtKZUaNXsoRnVileNavk-mpFWqkpQJQ7RUUo9IVQRombo8SFCF1wO44BHjzdxzBCG78-vvw2HIUO0v4KEVzZBh4sU1jbl4PAAGduhwx3ABvsxQsrH6MDbdYKTvzlHz9dXT4vbanl_c7e4XFZOcJ0rTxomJOeWdVI7r7R2beso116Klq6YBuV1yzrBWyrKlXAJ3DnFoFWaaM7n6Gz6W5q-bkuw6cdtHEqkYVJwSSXlTVGxSeXimFIEbzYxvNj4YSgxO3SmNzt0ZofOTOiK6WIyQen_FiCa5AIMrpCK4LLpxvCf_QctIndX</recordid><startdate>20210815</startdate><enddate>20210815</enddate><creator>Yu, Bin</creator><creator>Chen, Cheng</creator><creator>Wang, Xiaolin</creator><creator>Yu, Zhaomin</creator><creator>Ma, Anjun</creator><creator>Liu, Bingqiang</creator><general>Elsevier Ltd</general><general>Elsevier 
BV</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-5734-1135</orcidid><orcidid>https://orcid.org/0000-0001-8785-8058</orcidid><orcidid>https://orcid.org/0000-0002-7310-7963</orcidid><orcidid>https://orcid.org/0000-0002-4354-5508</orcidid><orcidid>https://orcid.org/0000-0002-2453-7852</orcidid><orcidid>https://orcid.org/0000-0001-6269-398X</orcidid></search><sort><creationdate>20210815</creationdate><title>Prediction of protein–protein interactions based on elastic net and deep forest</title><author>Yu, Bin ; Chen, Cheng ; Wang, Xiaolin ; Yu, Zhaomin ; Ma, Anjun ; Liu, Bingqiang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c438t-f0624533a2d58cf788c99c138f5491b28e7f892d43914788035e3cc72e9780833</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Amino acids</topic><topic>Composition</topic><topic>Decision trees</topic><topic>Deep forest</topic><topic>Elastic net</topic><topic>Experiments</topic><topic>Machine learning</topic><topic>Multi-information fusion</topic><topic>Performance prediction</topic><topic>Protein-protein interactions</topic><topic>Proteins</topic><topic>R&D</topic><topic>Research & development</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yu, Bin</creatorcontrib><creatorcontrib>Chen, Cheng</creatorcontrib><creatorcontrib>Wang, Xiaolin</creatorcontrib><creatorcontrib>Yu, Zhaomin</creatorcontrib><creatorcontrib>Ma, Anjun</creatorcontrib><creatorcontrib>Liu, Bingqiang</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with 
Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Expert systems with applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yu, Bin</au><au>Chen, Cheng</au><au>Wang, Xiaolin</au><au>Yu, Zhaomin</au><au>Ma, Anjun</au><au>Liu, Bingqiang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prediction of protein–protein interactions based on elastic net and deep forest</atitle><jtitle>Expert systems with applications</jtitle><date>2021-08-15</date><risdate>2021</risdate><volume>176</volume><spage>114876</spage><pages>114876-</pages><artnum>114876</artnum><issn>0957-4174</issn><eissn>1873-6793</eissn><abstract>•A novel method (GcForest-PPI) to predict protein–protein interactions.•The PseAAC, AD, MMI, CTD, AAC-PSSM and DPC-PSSM are fused to extract feature information.•The elastic net is employed to eliminate redundant and irrelevant features.•We firstly use deep forest as classifier to predict PPIs via layer-by-layer processing of raw features.•GcForest-PPI model has good generalization ability on cross-species datasets and PPIs network.
Prediction of protein–protein interactions (PPIs) helps to grasp molecular roots of disease. However, web-lab experiments to predict PPIs are limited and costly. Using machine-learning-based frameworks can not only automatically identify PPIs, but also provide new ideas for drug research and development from a promising alternative. We present a novel deep-forest-based method for PPIs prediction. Firstly, pseudo amino acid composition (PAAC), autocorrelation descriptor (Auto), multivariate mutual information (MMI), composition-transition-distribution (CTD), amino acid composition position-specific scoring matrix (AAC-PSSM), and dipeptide composition PSSM (DPC-PSSM) are adopted to extract and construct the pattern of PPIs. Secondly, elastic net is utilized to optimize the initial feature vectors and boost the predictive performance. Finally, we ensemble XGBoost, random forest, and extremely randomized trees to construct deep forest model via cascade architecture for PPIs prediction (GcForest-PPI). Benchmark experiments reveal that the proposed approach outperforms other state-of-the-art predictors on Saccharomyces cerevisiae and Helicobacter pylori. We also apply GcForest-PPI on independent test sets, CD9-core network, crossover network, and cancer-specific network. The evaluation shows that GcForest-PPI can boost the prediction accuracy, complement experiments and improve drug discovery.</abstract><cop>New York</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.eswa.2021.114876</doi><orcidid>https://orcid.org/0000-0002-5734-1135</orcidid><orcidid>https://orcid.org/0000-0001-8785-8058</orcidid><orcidid>https://orcid.org/0000-0002-7310-7963</orcidid><orcidid>https://orcid.org/0000-0002-4354-5508</orcidid><orcidid>https://orcid.org/0000-0002-2453-7852</orcidid><orcidid>https://orcid.org/0000-0001-6269-398X</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0957-4174 |
ispartof | Expert systems with applications, 2021-08, Vol.176, p.114876, Article 114876 |
issn | 0957-4174 ; 1873-6793 |
language | eng |
recordid | cdi_proquest_journals_2543515136 |
source | ScienceDirect Freedom Collection |
subjects | Amino acids ; Composition ; Decision trees ; Deep forest ; Elastic net ; Experiments ; Machine learning ; Multi-information fusion ; Performance prediction ; Protein-protein interactions ; Proteins ; R&D ; Research & development |
title | Prediction of protein–protein interactions based on elastic net and deep forest |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T17%3A15%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prediction%20of%20protein%E2%80%93protein%20interactions%20based%20on%20elastic%20net%20and%20deep%20forest&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Yu,%20Bin&rft.date=2021-08-15&rft.volume=176&rft.spage=114876&rft.pages=114876-&rft.artnum=114876&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2021.114876&rft_dat=%3Cproquest_cross%3E2543515136%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c438t-f0624533a2d58cf788c99c138f5491b28e7f892d43914788035e3cc72e9780833%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2543515136&rft_id=info:pmid/&rfr_iscdi=true |
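
The description above outlines a three-step pipeline: feature extraction, elastic net feature selection, and a cascade of tree ensembles. The following is a minimal sketch of the last two steps only, not the authors' implementation. It rests on several assumptions that do not come from this record: the data are synthetic (scikit-learn's make_classification) rather than real PseAAC/PSSM descriptors, scikit-learn's GradientBoostingClassifier stands in for XGBoost, the cascade has a fixed depth of two levels, and every hyperparameter and function name (elastic_net_mask, make_forests, cascade_fit_predict, run_demo) is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.preprocessing import StandardScaler


def elastic_net_mask(X, y, alpha=0.01, l1_ratio=0.5):
    """Return a boolean mask of features with non-zero elastic net weights."""
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)
    model.fit(X, y)
    mask = model.coef_ != 0
    # Fall back to all features if the penalty zeroed every coefficient.
    return mask if mask.any() else np.ones(X.shape[1], dtype=bool)


def make_forests(seed):
    """Ensembles for one cascade level (assumption: boosting replaces XGBoost)."""
    return [
        GradientBoostingClassifier(n_estimators=50, random_state=seed),
        RandomForestClassifier(n_estimators=50, random_state=seed),
        ExtraTreesClassifier(n_estimators=50, random_state=seed),
    ]


def cascade_fit_predict(X_train, y_train, X_test, n_levels=2, seed=0):
    """Train a fixed-depth cascade and return class probabilities for X_test."""
    aug_train, aug_test = X_train, X_test
    for level in range(n_levels):
        train_probs, test_probs = [], []
        for forest in make_forests(seed + level):
            # Out-of-fold probabilities: the next level never sees predictions
            # a forest made on rows it was trained on.
            train_probs.append(
                cross_val_predict(
                    forest, aug_train, y_train, cv=3, method="predict_proba"
                )
            )
            forest.fit(aug_train, y_train)
            test_probs.append(forest.predict_proba(aug_test))
        # Next level input: selected features plus this level's class vectors.
        aug_train = np.hstack([X_train] + train_probs)
        aug_test = np.hstack([X_test] + test_probs)
    # Final output: average of the last level's class vectors.
    return np.mean(test_probs, axis=0)


def run_demo(seed=0):
    """Run selection and cascade on synthetic data (not real PPI features)."""
    X, y = make_classification(
        n_samples=300, n_features=40, n_informative=8, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
    mask = elastic_net_mask(X_train, y_train)
    proba = cascade_fit_predict(
        X_train[:, mask], y_train, X_test[:, mask], seed=seed
    )
    accuracy = float((proba.argmax(axis=1) == y_test).mean())
    return mask, proba, accuracy
```

Calling run_demo() returns the selected-feature mask, the averaged class probabilities for the held-out rows, and the held-out accuracy. The out-of-fold step is the main design choice: feeding a forest's in-sample predictions to the next level would leak training labels into the augmented features.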