
A mathematical representation of protein binding sites using structural dispersion of atoms from principal axes for classification of binding ligands

Bibliographic Details
Published in: PloS one 2021-04, Vol.16 (4), p.e0244905-e0244905
Main Authors: Premarathna, Galkande Iresha; Ellingson, Leif
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c641t-12a72fe42aaeb85b6ee314da8d6b7088070dded7fef50e20aa5dce70eadad27d3
container_end_page e0244905
container_issue 4
container_start_page e0244905
container_title PloS one
container_volume 16
creator Premarathna, Galkande Iresha
Ellingson, Leif
description Many researchers have studied the relationship between the biological functions of proteins and the structures of both their overall backbones of amino acids and their binding sites. A large amount of the work has focused on summarizing structural features of binding sites as scalar quantities, which can result in a great deal of information loss since the structures are three-dimensional. Additionally, a common way of comparing binding sites is via aligning their atoms, which is a computationally intensive procedure that substantially limits the types of analysis and modeling that can be done. In this work, we develop a novel encoding of binding sites as covariance matrices of the distances of atoms to the principal axes of the structures. This representation is invariant to the chosen coordinate system for the atoms in the binding sites, which removes the need to align the sites to a common coordinate system, is computationally efficient, and permits the development of probability models. These can then be used to both better understand groups of binding sites that bind to the same ligand and perform classification for these ligand groups. We demonstrate the utility of our method for discrimination of binding ligand through classification studies with two benchmark datasets using nearest mean and polytomous logistic regression classifiers.
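The encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact construction, so the code assumes the principal axes are the eigenvectors of the centered coordinate covariance matrix, that each atom contributes its three perpendicular distances to those axes, and that the Frobenius norm is an acceptable distance for the nearest mean step. The function names `principal_axis_distance_covariance` and `nearest_mean_predict` are hypothetical.

```python
import numpy as np


def principal_axis_distance_covariance(coords):
    """Return the 3x3 covariance matrix of atom-to-principal-axis distances.

    Assumption: principal axes are the eigenvectors of the covariance
    matrix of the centered atom coordinates (an n x 3 array).
    """
    X = np.asarray(coords, dtype=float)
    centered = X - X.mean(axis=0)
    # Eigenvectors (columns) of the 3x3 coordinate covariance matrix.
    _, axes = np.linalg.eigh(np.cov(centered, rowvar=False))
    # Signed projection of every atom onto every principal axis (n x 3).
    proj = centered @ axes
    # Squared distance to axis k is |x|^2 - (x . u_k)^2.
    sq_norm = np.sum(centered ** 2, axis=1, keepdims=True)
    dists = np.sqrt(np.clip(sq_norm - proj ** 2, 0.0, None))
    return np.cov(dists, rowvar=False)


def nearest_mean_predict(feature, class_means):
    """Pick the label whose mean matrix is closest in Frobenius norm.

    Assumption: the Frobenius norm stands in for whatever distance the
    paper uses between covariance matrices.
    """
    return min(
        class_means,
        key=lambda label: np.linalg.norm(feature - class_means[label]),
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    site = rng.normal(size=(40, 3)) * np.array([3.0, 2.0, 1.0])
    # Apply a random orthogonal transform and a translation.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    moved = site @ q.T + np.array([5.0, -2.0, 7.0])
    same = np.allclose(
        principal_axis_distance_covariance(site),
        principal_axis_distance_covariance(moved),
    )
    print("invariant to rotation and translation:", same)
```

Because the distances are measured relative to axes derived from the structure itself, rotating or translating the input leaves the matrix unchanged, which is the property that lets sites be compared without alignment. The sketch relies on the three principal variances being distinct; a near-symmetric site would need extra care.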
doi_str_mv 10.1371/journal.pone.0244905
format article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_2510234638</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A657819715</galeid><doaj_id>oai_doaj_org_article_531da04bb8cc4c4fbf1e53f338f157fc</doaj_id><sourcerecordid>A657819715</sourcerecordid><originalsourceid>FETCH-LOGICAL-c641t-12a72fe42aaeb85b6ee314da8d6b7088070dded7fef50e20aa5dce70eadad27d3</originalsourceid><addsrcrecordid>eNqNk12L1DAUhoso7rr6D0QLgujFjEnTr7kRlsWPgYUFv27DaXIyk6VNxiSV9Yf4fz0z21lmZC-k0Kbp874neZuTZc85m3PR8HfXfgwO-vnGO5yzoiwXrHqQnfKFKGZ1wcTDg_FJ9iTGa8Yq0db14-xEiFZwVrDT7M95PkBaI92sgj4PuAkY0SV69y73Jt8En9C6vLNOW7fKo00Y8zHuximMKo2BhNrGDYY4iSD5IeYm-IH01im7IQRuSGh8yFUPMVpDBfdF9ua9XYHT8Wn2yEAf8dn0PMu-f_zw7eLz7PLq0_Li_HKm6pKnGS-gKQyWBQB2bdXViIKXGlpddw1rW9YwrVE3Bk3FsGAAlVbYMAQNumi0OMte3vpueh_llGiURUXhiLKmlM6y5S2hPVxL2ssA4bf0YOVuwoeVhEDR9SgrwTWwsutapUpVms5wrIShrA2vGqPI6_1UbewGpJW4RMkdmR5_cXYtV_6XbBn9rZaTwZvJIPifI8YkBxsV9j049ONu3bwoWbXYrvvVP-j9u5uoFdAGrDOe6qqtqTyvq6bli4ZXRM3voejSOFhFx89Ymj8SvD0SEJPwJq1gjFEuv375f_bqxzH7-oBdI_RpHX0_bk9RPAbLW1AFH2NAcxcyZ3LbPfs05LZ75NQ9JHtx-IPuRPt2EX8Bt44aMw</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2510234638</pqid></control><display><type>article</type><title>A mathematical representation of protein binding sites using structural dispersion of atoms from principal axes for classification of binding ligands</title><source>Open Access: PubMed Central</source><source>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</source><creator>Premarathna, Galkande Iresha ; Ellingson, Leif</creator><contributor>Krishnan, Viswanathan V</contributor><creatorcontrib>Premarathna, Galkande Iresha ; Ellingson, Leif ; Krishnan, Viswanathan V</creatorcontrib><description>Many researchers have studied the relationship between the biological functions of proteins and the structures of both their overall backbones of amino acids and their binding sites. 
A large amount of the work has focused on summarizing structural features of binding sites as scalar quantities, which can result in a great deal of information loss since the structures are three-dimensional. Additionally, a common way of comparing binding sites is via aligning their atoms, which is a computationally intensive procedure that substantially limits the types of analysis and modeling that can be done. In this work, we develop a novel encoding of binding sites as covariance matrices of the distances of atoms to the principal axes of the structures. This representation is invariant to the chosen coordinate system for the atoms in the binding sites, which removes the need to align the sites to a common coordinate system, is computationally efficient, and permits the development of probability models. These can then be used to both better understand groups of binding sites that bind to the same ligand and perform classification for these ligand groups. We demonstrate the utility of our method for discrimination of binding ligand through classification studies with two benchmark datasets using nearest mean and polytomous logistic regression classifiers.</description><identifier>ISSN: 1932-6203</identifier><identifier>EISSN: 1932-6203</identifier><identifier>DOI: 10.1371/journal.pone.0244905</identifier><identifier>PMID: 33831020</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Amino acids ; Binding sites ; Bioinformatics ; Biology and Life Sciences ; Computer and Information Sciences ; Genetic aspects ; Identification and classification ; Ligands ; Ligands (Biochemistry) ; Machine learning ; NMR ; Nuclear magnetic resonance ; Performance evaluation ; Physical Sciences ; Protein binding ; Proteins ; Research and Analysis Methods ; Researchers ; Statistical analysis</subject><ispartof>PloS one, 2021-04, Vol.16 (4), p.e0244905-e0244905</ispartof><rights>COPYRIGHT 2021 Public Library of 
Science</rights><rights>2021 Premarathna, Ellingson. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2021 Premarathna, Ellingson 2021 Premarathna, Ellingson</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c641t-12a72fe42aaeb85b6ee314da8d6b7088070dded7fef50e20aa5dce70eadad27d3</cites><orcidid>0000-0002-4425-5369</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2510234638/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2510234638?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,25752,27923,27924,37011,37012,44589,53790,53792,74897</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33831020$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Krishnan, Viswanathan V</contributor><creatorcontrib>Premarathna, Galkande Iresha</creatorcontrib><creatorcontrib>Ellingson, Leif</creatorcontrib><title>A mathematical representation of protein binding sites using structural dispersion of atoms from principal axes for classification of binding ligands</title><title>PloS one</title><addtitle>PLoS One</addtitle><description>Many researchers have studied the relationship between the biological functions of proteins and the structures of both their overall 
backbones of amino acids and their binding sites. A large amount of the work has focused on summarizing structural features of binding sites as scalar quantities, which can result in a great deal of information loss since the structures are three-dimensional. Additionally, a common way of comparing binding sites is via aligning their atoms, which is a computationally intensive procedure that substantially limits the types of analysis and modeling that can be done. In this work, we develop a novel encoding of binding sites as covariance matrices of the distances of atoms to the principal axes of the structures. This representation is invariant to the chosen coordinate system for the atoms in the binding sites, which removes the need to align the sites to a common coordinate system, is computationally efficient, and permits the development of probability models. These can then be used to both better understand groups of binding sites that bind to the same ligand and perform classification for these ligand groups. 
We demonstrate the utility of our method for discrimination of binding ligand through classification studies with two benchmark datasets using nearest mean and polytomous logistic regression classifiers.</description><subject>Amino acids</subject><subject>Binding sites</subject><subject>Bioinformatics</subject><subject>Biology and Life Sciences</subject><subject>Computer and Information Sciences</subject><subject>Genetic aspects</subject><subject>Identification and classification</subject><subject>Ligands</subject><subject>Ligands (Biochemistry)</subject><subject>Machine learning</subject><subject>NMR</subject><subject>Nuclear magnetic resonance</subject><subject>Performance evaluation</subject><subject>Physical Sciences</subject><subject>Protein binding</subject><subject>Proteins</subject><subject>Research and Analysis Methods</subject><subject>Researchers</subject><subject>Statistical analysis</subject><issn>1932-6203</issn><issn>1932-6203</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNqNk12L1DAUhoso7rr6D0QLgujFjEnTr7kRlsWPgYUFv27DaXIyk6VNxiSV9Yf4fz0z21lmZC-k0Kbp874neZuTZc85m3PR8HfXfgwO-vnGO5yzoiwXrHqQnfKFKGZ1wcTDg_FJ9iTGa8Yq0db14-xEiFZwVrDT7M95PkBaI92sgj4PuAkY0SV69y73Jt8En9C6vLNOW7fKo00Y8zHuximMKo2BhNrGDYY4iSD5IeYm-IH01im7IQRuSGh8yFUPMVpDBfdF9ua9XYHT8Wn2yEAf8dn0PMu-f_zw7eLz7PLq0_Li_HKm6pKnGS-gKQyWBQB2bdXViIKXGlpddw1rW9YwrVE3Bk3FsGAAlVbYMAQNumi0OMte3vpueh_llGiURUXhiLKmlM6y5S2hPVxL2ssA4bf0YOVuwoeVhEDR9SgrwTWwsutapUpVms5wrIShrA2vGqPI6_1UbewGpJW4RMkdmR5_cXYtV_6XbBn9rZaTwZvJIPifI8YkBxsV9j049ONu3bwoWbXYrvvVP-j9u5uoFdAGrDOe6qqtqTyvq6bli4ZXRM3voejSOFhFx89Ymj8SvD0SEJPwJq1gjFEuv375f_bqxzH7-oBdI_RpHX0_bk9RPAbLW1AFH2NAcxcyZ3LbPfs05LZ75NQ9JHtx-IPuRPt2EX8Bt44aMw</recordid><startdate>20210408</startdate><enddate>20210408</enddate><creator>Premarathna, Galkande Iresha</creator><creator>Ellingson, Leif</creator><general>Public Library of 
Science</general><general>Public Library of Science (PLoS)</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IOV</scope><scope>ISR</scope><scope>3V.</scope><scope>7QG</scope><scope>7QL</scope><scope>7QO</scope><scope>7RV</scope><scope>7SN</scope><scope>7SS</scope><scope>7T5</scope><scope>7TG</scope><scope>7TM</scope><scope>7U9</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>D1I</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>KB.</scope><scope>KB0</scope><scope>KL.</scope><scope>L6V</scope><scope>LK8</scope><scope>M0K</scope><scope>M0S</scope><scope>M1P</scope><scope>M7N</scope><scope>M7P</scope><scope>M7S</scope><scope>NAPCQ</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PATMY</scope><scope>PDBOC</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope><scope>PYCSY</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-4425-5369</orcidid></search><sort><creationdate>20210408</creationdate><title>A mathematical representation of protein binding sites using structural dispersion of atoms from principal axes for classification of binding ligands</title><author>Premarathna, Galkande Iresha ; Ellingson, 
Leif</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c641t-12a72fe42aaeb85b6ee314da8d6b7088070dded7fef50e20aa5dce70eadad27d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Amino acids</topic><topic>Binding sites</topic><topic>Bioinformatics</topic><topic>Biology and Life Sciences</topic><topic>Computer and Information Sciences</topic><topic>Genetic aspects</topic><topic>Identification and classification</topic><topic>Ligands</topic><topic>Ligands (Biochemistry)</topic><topic>Machine learning</topic><topic>NMR</topic><topic>Nuclear magnetic resonance</topic><topic>Performance evaluation</topic><topic>Physical Sciences</topic><topic>Protein binding</topic><topic>Proteins</topic><topic>Research and Analysis Methods</topic><topic>Researchers</topic><topic>Statistical analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Premarathna, Galkande Iresha</creatorcontrib><creatorcontrib>Ellingson, Leif</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Opposing Viewpoints Resource Center</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Nursing &amp; Allied Health Database</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Immunology Abstracts</collection><collection>Meteorological &amp; Geoastrophysical Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Agricultural Science Collection</collection><collection>ProQuest_Health &amp; Medical 
Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>ProQuest Public Health Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Advanced Technologies &amp; Aerospace Database‎ (1962 - current)</collection><collection>ProQuest Agricultural &amp; Environmental Science</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Materials Science Collection</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Materials 
Science Database</collection><collection>Nursing &amp; Allied Health Database (Alumni Edition)</collection><collection>Meteorological &amp; Geoastrophysical Abstracts - Academic</collection><collection>ProQuest Engineering Collection</collection><collection>Biological Sciences</collection><collection>Agriculture Science Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>Nursing &amp; Allied Health Premium</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Environmental Science Database</collection><collection>Materials Science Collection</collection><collection>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection><collection>Environmental Science Collection</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals (Open Access)</collection><jtitle>PloS one</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Premarathna, Galkande Iresha</au><au>Ellingson, Leif</au><au>Krishnan, Viswanathan V</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A mathematical representation of protein binding sites using structural 
dispersion of atoms from principal axes for classification of binding ligands</atitle><jtitle>PloS one</jtitle><addtitle>PLoS One</addtitle><date>2021-04-08</date><risdate>2021</risdate><volume>16</volume><issue>4</issue><spage>e0244905</spage><epage>e0244905</epage><pages>e0244905-e0244905</pages><issn>1932-6203</issn><eissn>1932-6203</eissn><abstract>Many researchers have studied the relationship between the biological functions of proteins and the structures of both their overall backbones of amino acids and their binding sites. A large amount of the work has focused on summarizing structural features of binding sites as scalar quantities, which can result in a great deal of information loss since the structures are three-dimensional. Additionally, a common way of comparing binding sites is via aligning their atoms, which is a computationally intensive procedure that substantially limits the types of analysis and modeling that can be done. In this work, we develop a novel encoding of binding sites as covariance matrices of the distances of atoms to the principal axes of the structures. This representation is invariant to the chosen coordinate system for the atoms in the binding sites, which removes the need to align the sites to a common coordinate system, is computationally efficient, and permits the development of probability models. These can then be used to both better understand groups of binding sites that bind to the same ligand and perform classification for these ligand groups. We demonstrate the utility of our method for discrimination of binding ligand through classification studies with two benchmark datasets using nearest mean and polytomous logistic regression classifiers.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>33831020</pmid><doi>10.1371/journal.pone.0244905</doi><tpages>e0244905</tpages><orcidid>https://orcid.org/0000-0002-4425-5369</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1932-6203
ispartof PloS one, 2021-04, Vol.16 (4), p.e0244905-e0244905
issn 1932-6203
1932-6203
language eng
recordid cdi_plos_journals_2510234638
source Open Access: PubMed Central; Publicly Available Content Database (Proquest) (PQ_SDU_P3)
subjects Amino acids
Binding sites
Bioinformatics
Biology and Life Sciences
Computer and Information Sciences
Genetic aspects
Identification and classification
Ligands
Ligands (Biochemistry)
Machine learning
NMR
Nuclear magnetic resonance
Performance evaluation
Physical Sciences
Protein binding
Proteins
Research and Analysis Methods
Researchers
Statistical analysis
title A mathematical representation of protein binding sites using structural dispersion of atoms from principal axes for classification of binding ligands
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T07%3A53%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20mathematical%20representation%20of%20protein%20binding%20sites%20using%20structural%20dispersion%20of%20atoms%20from%20principal%20axes%20for%20classification%20of%20binding%20ligands&rft.jtitle=PloS%20one&rft.au=Premarathna,%20Galkande%20Iresha&rft.date=2021-04-08&rft.volume=16&rft.issue=4&rft.spage=e0244905&rft.epage=e0244905&rft.pages=e0244905-e0244905&rft.issn=1932-6203&rft.eissn=1932-6203&rft_id=info:doi/10.1371/journal.pone.0244905&rft_dat=%3Cgale_plos_%3EA657819715%3C/gale_plos_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c641t-12a72fe42aaeb85b6ee314da8d6b7088070dded7fef50e20aa5dce70eadad27d3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2510234638&rft_id=info:pmid/33831020&rft_galeid=A657819715&rfr_iscdi=true