Signal Peptides Generated by Attention-Based Neural Networks

Short (15–30 residue) chains of amino acids at the amino termini of expressed proteins known as signal peptides (SPs) specify secretion in living cells. We trained an attention-based neural network, the Transformer model, on data from all available organisms in Swiss-Prot to generate SP sequences. E...

Bibliographic Details
Published in: ACS synthetic biology 2020-08, Vol.9 (8), p.2154-2161
Main Authors: Wu, Zachary; Yang, Kevin K; Liszka, Michael J; Lee, Alycia; Batzilla, Alina; Wernick, David; Weiner, David P; Arnold, Frances H
Format: Article
Language: English
container_end_page 2161
container_issue 8
container_start_page 2154
container_title ACS synthetic biology
container_volume 9
creator Wu, Zachary
Yang, Kevin K
Liszka, Michael J
Lee, Alycia
Batzilla, Alina
Wernick, David
Weiner, David P
Arnold, Frances H
description Short (15–30 residue) chains of amino acids at the amino termini of expressed proteins known as signal peptides (SPs) specify secretion in living cells. We trained an attention-based neural network, the Transformer model, on data from all available organisms in Swiss-Prot to generate SP sequences. Experimental testing demonstrates that the model-generated SPs are functional: when appended to enzymes expressed in an industrial Bacillus subtilis strain, the SPs lead to secreted activity that is competitive with industrially used SPs. Additionally, the model-generated SPs are diverse in sequence, sharing as little as 58% sequence identity to the closest known native signal peptide and 73% ± 9% on average.
doi_str_mv 10.1021/acssynbio.0c00219
format article
identifier ISSN: 2161-5063
ispartof ACS synthetic biology, 2020-08, Vol.9 (8), p.2154-2161
issn 2161-5063
language eng
recordid cdi_proquest_miscellaneous_2423066286
source American Chemical Society:Jisc Collections:American Chemical Society Read & Publish Agreement 2022-2024 (Reading list)
title Signal Peptides Generated by Attention-Based Neural Networks
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T05%3A17%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Signal%20Peptides%20Generated%20by%20Attention-Based%20Neural%20Networks&rft.jtitle=ACS%20synthetic%20biology&rft.au=Wu,%20Zachary&rft.date=2020-08-21&rft.volume=9&rft.issue=8&rft.spage=2154&rft.epage=2161&rft.pages=2154-2161&rft.issn=2161-5063&rft.eissn=2161-5063&rft_id=info:doi/10.1021/acssynbio.0c00219&rft_dat=%3Cproquest_cross%3E2423066286%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a359t-e5710ff96a799a62c73c1731218ac6b5d2d155a5094147deff3a07c6e471a3ec3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2423066286&rft_id=info:pmid/&rfr_iscdi=true