Advancing the accuracy of SARS-CoV-2 phosphorylation site detection via meta-learning approach

Bibliographic Details
Published in: Briefings in bioinformatics 2023-11, Vol.25 (1)
Main Authors: Pham, Nhat Truong; Phan, Le Thi; Seo, Jimin; Kim, Yeonwoo; Song, Minkyung; Lee, Sukchan; Jeon, Young-Jun; Manavalan, Balachandran
Format: Article
Language: English
container_issue 1
container_title Briefings in bioinformatics
container_volume 25
creator Pham, Nhat Truong
Phan, Le Thi
Seo, Jimin
Kim, Yeonwoo
Song, Minkyung
Lee, Sukchan
Jeon, Young-Jun
Manavalan, Balachandran
description The worldwide appearance of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has generated significant concern and posed a considerable challenge to global health. Phosphorylation is a common post-translational modification that affects many vital cellular functions and is closely associated with SARS-CoV-2 infection. Precise identification of phosphorylation sites could provide more in-depth insight into the processes underlying SARS-CoV-2 infection and help alleviate the continuing COVID-19 crisis. Currently, available computational tools for predicting these sites lack accuracy and effectiveness. In this study, we designed an innovative meta-learning model, Meta-Learning for Serine/Threonine Phosphorylation (MeL-STPhos), to precisely identify protein phosphorylation sites. We initially performed a comprehensive assessment of 29 unique sequence-derived features, establishing prediction models for each using 14 renowned machine learning methods, ranging from traditional classifiers to advanced deep learning algorithms. We then selected the most effective model for each feature by integrating the predicted values. Rigorous feature selection strategies were employed to identify the optimal base models and classifier(s) for each cell-specific dataset. To the best of our knowledge, this is the first study to report two cell-specific models and a generic model for phosphorylation site prediction by utilizing an extensive range of sequence-derived features and machine learning algorithms. Extensive cross-validation and independent testing revealed that MeL-STPhos surpasses existing state-of-the-art tools for phosphorylation site prediction. We also developed a publicly accessible platform at https://balalab-skku.org/MeL-STPhos. We believe that MeL-STPhos will serve as a valuable tool for accelerating the discovery of serine/threonine phosphorylation sites and elucidating their role in post-translational regulation.
doi_str_mv 10.1093/bib/bbad433
format article
addtitle Brief Bioinform
date 2023-11-22
cop England
pub Oxford University Press
pmid 38058187
orcidid https://orcid.org/0000-0003-0697-9419
https://orcid.org/0000-0002-8086-6722
oa free_for_read
lds50 peer_reviewed
rights The Author(s) 2023. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10753650/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10753650/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/38058187
fulltext fulltext
identifier ISSN: 1467-5463
ispartof Briefings in bioinformatics, 2023-11, Vol.25 (1)
issn 1467-5463
1477-4054
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10753650
source Oxford Journals Open Access Collection; BSC - Ebsco (Business Source Ultimate); PubMed Central
subjects COVID-19
Humans
Phosphorylation
Problem Solving Protocol
SARS-CoV-2 - metabolism
Serine - metabolism
Threonine - metabolism
title Advancing the accuracy of SARS-CoV-2 phosphorylation site detection via meta-learning approach
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T04%3A41%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Advancing%20the%20accuracy%20of%20SARS-CoV-2%20phosphorylation%20site%20detection%20via%20meta-learning%20approach&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Pham,%20Nhat%20Truong&rft.date=2023-11-22&rft.volume=25&rft.issue=1&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbad433&rft_dat=%3Cproquest_pubme%3E2899374078%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c382t-b6b81736dc3d66c13683943315585ffc10087e9a47b456c7018da028fbbbdf4a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2899374078&rft_id=info:pmid/38058187&rfr_iscdi=true