Deep learning for optimization of protein expression

Bibliographic Details
Published in: Current Opinion in Biotechnology, 2023-06, Vol. 81, Article 102941
Main Authors: Nikolados, Evangelos-Marios; Oyarzún, Diego A
Format: Article
Language: English
Description
Summary: Advances in high-throughput DNA synthesis and sequencing have fuelled the use of massively parallel reporter assays for strain characterization. These experiments produce large datasets that map DNA sequences to protein expression levels, and have sparked increased interest in data-driven methods for sequence-to-expression modeling. Here, we highlight progress in deep learning models of protein expression and their potential for optimizing strains engineered to produce recombinant proteins. We discuss recent works that built highly accurate models as well as the challenges that hinder wider adoption by end users. There is a need to better align this technology with the requirements and capabilities encountered in strain engineering, particularly the cost of data acquisition and the need for interpretable models that generalize beyond the training data. Overcoming these barriers will help to incentivize academic and industrial laboratories to tap into a new era of data-centric strain engineering.
• Deep learning produces highly accurate predictors of protein expression.
• Recent works demonstrate substantial potential in strain design and optimization.
• Key challenges are the cost of data acquisition, poor interpretability, and the inability of models to extrapolate predictions beyond the training data.
• There is a need to align models with the requirements and capabilities of end users in microbial engineering.
ISSN: 0958-1669, 1879-0429
DOI: 10.1016/j.copbio.2023.102941