Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences - PNAS 2017-03, Vol.114 (13), p.E2662-E2671
Main Authors: Uguzzoni, Guido, Lovis, Shalini John, Oteri, Francesco, Schug, Alexander, Szurmant, Hendrik, Weigt, Martin
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c543t-17c63e3167f796c6c911cddaa7c6b8bc4a4819cd5446a95f08072c4614c87a3e3
cites cdi_FETCH-LOGICAL-c543t-17c63e3167f796c6c911cddaa7c6b8bc4a4819cd5446a95f08072c4614c87a3e3
container_end_page E2671
container_issue 13
container_start_page E2662
container_title Proceedings of the National Academy of Sciences - PNAS
container_volume 114
creator Uguzzoni, Guido
Lovis, Shalini John
Oteri, Francesco
Schug, Alexander
Szurmant, Hendrik
Weigt, Martin
description Proteins have evolved to perform diverse cellular functions, from serving as reaction catalysts to coordinating cellular propagation and development. Frequently, proteins do not exert their full potential as monomers but rather undergo concerted interactions as either homo-oligomers or with other proteins as hetero-oligomers. The experimental study of such protein complexes and interactions has been arduous. Theoretical structure prediction methods are an attractive alternative. Here, we investigate homo-oligomeric interfaces by tracing residue coevolution via the global statistical direct coupling analysis (DCA). DCA can accurately infer spatial adjacencies between residues. These adjacencies can be included as constraints in structure prediction techniques to predict high-resolution models. By taking advantage of the ongoing exponential growth of sequence databases, we go significantly beyond anecdotal cases of a few protein families and apply DCA to a systematic large-scale study of nearly 2,000 Pfam protein families with sufficient sequence information and structurally resolved homo-oligomeric interfaces. We find that large interfaces are commonly identified by DCA. We further demonstrate that DCA can differentiate between subfamilies with different binding modes within one large Pfam family. Sequence-derived contact information for the subfamilies proves sufficient to assemble accurate structural models of the diverse protein-oligomers. Thus, we provide an approach to investigate oligomerization for arbitrary protein families leading to structural models complementary to often-difficult experimental methods. Combined with ever more abundant sequential data, we anticipate that this study will be instrumental to allow the structural description of many heteroprotein complexes in the future.
doi_str_mv 10.1073/pnas.1615068114
format article
fullrecord <record><control><sourceid>jstor_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5380090</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26480376</jstor_id><sourcerecordid>26480376</sourcerecordid><originalsourceid>FETCH-LOGICAL-c543t-17c63e3167f796c6c911cddaa7c6b8bc4a4819cd5446a95f08072c4614c87a3e3</originalsourceid><addsrcrecordid>eNpdkk1v1DAQhi0EokvhzAlkiQsc0noSf14qVRVQpJW4wNnyOk7Wq8QOdrJo_z3ebmmhJ8szz_uOZzwIvQVyAUQ0l1Mw-QI4MMIlAH2GVkAUVJwq8hytCKlFJWlNz9CrnHeEEMUkeYnOallLBUqu0O-1Sb2rsjWDw751Yfadt2b2MeDYYRvdPg7L3TX7PpghY2NTzBlv4xirOPg-ji55i6cUZ-cD9mF2qTPWZbw54NYnZ-fis0yDDz02xeKQfX6NXnTFzL25P8_Rzy-ff9zcVuvvX7_dXK8ry2gzVyAsb1wDXHRCccutArBta0yJb-TGUkMlKNsySrlRrCOSiNpSDtRKYYryHF2dfKdlM7rWlgaTGfSU_GjSQUfj9f-Z4Le6j3vNGlnGRYrBp5PB9ons9nqtjzECjHMl6z0U9uN9sRR_LS7PevTZumEwwcUla5BCsJqqWhX0wxN0F5d0nG-hJKubwjSFujxRdyNPrnt4ARB9XAB9XAD9uABF8f7ffh_4vz9egHcnYJfnmB7znErSCN78AZTGuHU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1885232933</pqid></control><display><type>article</type><title>Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis</title><source>JSTOR Archival Journals and Primary Sources Collection【Remote access available】</source><source>PubMed Central</source><creator>Uguzzoni, Guido ; Lovis, Shalini John ; Oteri, Francesco ; Schug, Alexander ; Szurmant, Hendrik ; Weigt, Martin</creator><creatorcontrib>Uguzzoni, Guido ; Lovis, Shalini John ; Oteri, Francesco ; Schug, Alexander ; Szurmant, Hendrik ; Weigt, Martin</creatorcontrib><description>Proteins have evolved to perform diverse cellular functions, from serving as reaction catalysts to coordinating cellular propagation and development. 
Frequently, proteins do not exert their full potential as monomers but rather undergo concerted interactions as either homo-oligomers or with other proteins as hetero-oligomers. The experimental study of such protein complexes and interactions has been arduous. Theoretical structure prediction methods are an attractive alternative. Here, we investigate homo-oligomeric interfaces by tracing residue coevolution via the global statistical direct coupling analysis (DCA). DCA can accurately infer spatial adjacencies between residues. These adjacencies can be included as constraints in structure prediction techniques to predict high-resolution models. By taking advantage of the ongoing exponential growth of sequence databases, we go significantly beyond anecdotal cases of a few protein families and apply DCA to a systematic large-scale study of nearly 2,000 Pfam protein families with sufficient sequence information and structurally resolved homo-oligomeric interfaces. We find that large interfaces are commonly identified by DCA. We further demonstrate that DCA can differentiate between subfamilies with different binding modes within one large Pfam family. Sequence-derived contact information for the subfamilies proves sufficient to assemble accurate structural models of the diverse protein-oligomers. Thus, we provide an approach to investigate oligomerization for arbitrary protein families leading to structural models complementary to often-difficult experimental methods. 
Combined with ever more abundant sequential data, we anticipate that this study will be instrumental to allow the structural description of many heteroprotein complexes in the future.</description><identifier>ISSN: 0027-8424</identifier><identifier>EISSN: 1091-6490</identifier><identifier>DOI: 10.1073/pnas.1615068114</identifier><identifier>PMID: 28289198</identifier><language>eng</language><publisher>United States: National Academy of Sciences</publisher><subject>Biological Sciences ; Catalysts ; Cells ; Information ; Life Sciences ; Nonlinear Sciences ; Physical Sciences ; PNAS Plus ; Proteins</subject><ispartof>Proceedings of the National Academy of Sciences - PNAS, 2017-03, Vol.114 (13), p.E2662-E2671</ispartof><rights>Volumes 1–89 and 106–114, copyright as a collective work only; author(s) retains copyright to individual articles</rights><rights>Copyright National Academy of Sciences Mar 28, 2017</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c543t-17c63e3167f796c6c911cddaa7c6b8bc4a4819cd5446a95f08072c4614c87a3e3</citedby><cites>FETCH-LOGICAL-c543t-17c63e3167f796c6c911cddaa7c6b8bc4a4819cd5446a95f08072c4614c87a3e3</cites><orcidid>0000-0002-0492-3684 ; 0000-0002-4284-3597</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/26480376$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/26480376$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53770,53772,58217,58450</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/28289198$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.sorbonne-universite.fr/hal-01566982$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Uguzzoni, Guido</creatorcontrib><creatorcontrib>Lovis, Shalini John</creatorcontrib><creatorcontrib>Oteri, Francesco</creatorcontrib><creatorcontrib>Schug, Alexander</creatorcontrib><creatorcontrib>Szurmant, Hendrik</creatorcontrib><creatorcontrib>Weigt, Martin</creatorcontrib><title>Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis</title><title>Proceedings of the National Academy of Sciences - PNAS</title><addtitle>Proc Natl Acad Sci U S A</addtitle><description>Proteins have evolved to perform diverse cellular functions, from serving as reaction catalysts to coordinating cellular propagation and development. Frequently, proteins do not exert their full potential as monomers but rather undergo concerted interactions as either homo-oligomers or with other proteins as hetero-oligomers. The experimental study of such protein complexes and interactions has been arduous. Theoretical structure prediction methods are an attractive alternative. Here, we investigate homo-oligomeric interfaces by tracing residue coevolution via the global statistical direct coupling analysis (DCA). DCA can accurately infer spatial adjacencies between residues. These adjacencies can be included as constraints in structure prediction techniques to predict high-resolution models. By taking advantage of the ongoing exponential growth of sequence databases, we go significantly beyond anecdotal cases of a few protein families and apply DCA to a systematic large-scale study of nearly 2,000 Pfam protein families with sufficient sequence information and structurally resolved homo-oligomeric interfaces. We find that large interfaces are commonly identified by DCA. 
We further demonstrate that DCA can differentiate between subfamilies with different binding modes within one large Pfam family. Sequence-derived contact information for the subfamilies proves sufficient to assemble accurate structural models of the diverse protein-oligomers. Thus, we provide an approach to investigate oligomerization for arbitrary protein families leading to structural models complementary to often-difficult experimental methods. Combined with ever more abundant sequential data, we anticipate that this study will be instrumental to allow the structural description of many heteroprotein complexes in the future.</description><subject>Biological Sciences</subject><subject>Catalysts</subject><subject>Cells</subject><subject>Information</subject><subject>Life Sciences</subject><subject>Nonlinear Sciences</subject><subject>Physical Sciences</subject><subject>PNAS Plus</subject><subject>Proteins</subject><issn>0027-8424</issn><issn>1091-6490</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><recordid>eNpdkk1v1DAQhi0EokvhzAlkiQsc0noSf14qVRVQpJW4wNnyOk7Wq8QOdrJo_z3ebmmhJ8szz_uOZzwIvQVyAUQ0l1Mw-QI4MMIlAH2GVkAUVJwq8hytCKlFJWlNz9CrnHeEEMUkeYnOallLBUqu0O-1Sb2rsjWDw751Yfadt2b2MeDYYRvdPg7L3TX7PpghY2NTzBlv4xirOPg-ji55i6cUZ-cD9mF2qTPWZbw54NYnZ-fis0yDDz02xeKQfX6NXnTFzL25P8_Rzy-ff9zcVuvvX7_dXK8ry2gzVyAsb1wDXHRCccutArBta0yJb-TGUkMlKNsySrlRrCOSiNpSDtRKYYryHF2dfKdlM7rWlgaTGfSU_GjSQUfj9f-Z4Le6j3vNGlnGRYrBp5PB9ons9nqtjzECjHMl6z0U9uN9sRR_LS7PevTZumEwwcUla5BCsJqqWhX0wxN0F5d0nG-hJKubwjSFujxRdyNPrnt4ARB9XAB9XAD9uABF8f7ffh_4vz9egHcnYJfnmB7znErSCN78AZTGuHU</recordid><startdate>20170328</startdate><enddate>20170328</enddate><creator>Uguzzoni, Guido</creator><creator>Lovis, Shalini John</creator><creator>Oteri, Francesco</creator><creator>Schug, Alexander</creator><creator>Szurmant, Hendrik</creator><creator>Weigt, Martin</creator><general>National Academy of 
Sciences</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QG</scope><scope>7QL</scope><scope>7QP</scope><scope>7QR</scope><scope>7SN</scope><scope>7SS</scope><scope>7T5</scope><scope>7TK</scope><scope>7TM</scope><scope>7TO</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>1XC</scope><scope>VOOES</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-0492-3684</orcidid><orcidid>https://orcid.org/0000-0002-4284-3597</orcidid></search><sort><creationdate>20170328</creationdate><title>Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis</title><author>Uguzzoni, Guido ; Lovis, Shalini John ; Oteri, Francesco ; Schug, Alexander ; Szurmant, Hendrik ; Weigt, Martin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c543t-17c63e3167f796c6c911cddaa7c6b8bc4a4819cd5446a95f08072c4614c87a3e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Biological Sciences</topic><topic>Catalysts</topic><topic>Cells</topic><topic>Information</topic><topic>Life Sciences</topic><topic>Nonlinear Sciences</topic><topic>Physical Sciences</topic><topic>PNAS Plus</topic><topic>Proteins</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Uguzzoni, Guido</creatorcontrib><creatorcontrib>Lovis, Shalini John</creatorcontrib><creatorcontrib>Oteri, Francesco</creatorcontrib><creatorcontrib>Schug, Alexander</creatorcontrib><creatorcontrib>Szurmant, Hendrik</creatorcontrib><creatorcontrib>Weigt, Martin</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Calcium &amp; Calcified 
Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Immunology Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Uguzzoni, Guido</au><au>Lovis, Shalini John</au><au>Oteri, Francesco</au><au>Schug, Alexander</au><au>Szurmant, Hendrik</au><au>Weigt, Martin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis</atitle><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle><addtitle>Proc Natl Acad Sci U S A</addtitle><date>2017-03-28</date><risdate>2017</risdate><volume>114</volume><issue>13</issue><spage>E2662</spage><epage>E2671</epage><pages>E2662-E2671</pages><issn>0027-8424</issn><eissn>1091-6490</eissn><abstract>Proteins have evolved to perform 
diverse cellular functions, from serving as reaction catalysts to coordinating cellular propagation and development. Frequently, proteins do not exert their full potential as monomers but rather undergo concerted interactions as either homo-oligomers or with other proteins as hetero-oligomers. The experimental study of such protein complexes and interactions has been arduous. Theoretical structure prediction methods are an attractive alternative. Here, we investigate homo-oligomeric interfaces by tracing residue coevolution via the global statistical direct coupling analysis (DCA). DCA can accurately infer spatial adjacencies between residues. These adjacencies can be included as constraints in structure prediction techniques to predict high-resolution models. By taking advantage of the ongoing exponential growth of sequence databases, we go significantly beyond anecdotal cases of a few protein families and apply DCA to a systematic large-scale study of nearly 2,000 Pfam protein families with sufficient sequence information and structurally resolved homo-oligomeric interfaces. We find that large interfaces are commonly identified by DCA. We further demonstrate that DCA can differentiate between subfamilies with different binding modes within one large Pfam family. Sequence-derived contact information for the subfamilies proves sufficient to assemble accurate structural models of the diverse protein-oligomers. Thus, we provide an approach to investigate oligomerization for arbitrary protein families leading to structural models complementary to often-difficult experimental methods. 
Combined with ever more abundant sequential data, we anticipate that this study will be instrumental to allow the structural description of many heteroprotein complexes in the future.</abstract><cop>United States</cop><pub>National Academy of Sciences</pub><pmid>28289198</pmid><doi>10.1073/pnas.1615068114</doi><orcidid>https://orcid.org/0000-0002-0492-3684</orcidid><orcidid>https://orcid.org/0000-0002-4284-3597</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0027-8424
ispartof Proceedings of the National Academy of Sciences - PNAS, 2017-03, Vol.114 (13), p.E2662-E2671
issn 0027-8424
1091-6490
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5380090
source JSTOR Archival Journals and Primary Sources Collection【Remote access available】; PubMed Central
subjects Biological Sciences
Catalysts
Cells
Information
Life Sciences
Nonlinear Sciences
Physical Sciences
PNAS Plus
Proteins
title Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T14%3A06%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Large-scale%20identification%20of%20coevolution%20signals%20across%20homo-oligomeric%20protein%20interfaces%20by%20direct%20coupling%20analysis&rft.jtitle=Proceedings%20of%20the%20National%20Academy%20of%20Sciences%20-%20PNAS&rft.au=Uguzzoni,%20Guido&rft.date=2017-03-28&rft.volume=114&rft.issue=13&rft.spage=E2662&rft.epage=E2671&rft.pages=E2662-E2671&rft.issn=0027-8424&rft.eissn=1091-6490&rft_id=info:doi/10.1073/pnas.1615068114&rft_dat=%3Cjstor_pubme%3E26480376%3C/jstor_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c543t-17c63e3167f796c6c911cddaa7c6b8bc4a4819cd5446a95f08072c4614c87a3e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1885232933&rft_id=info:pmid/28289198&rft_jstor_id=26480376&rfr_iscdi=true