A systematic analysis of regression models for protein engineering

Bibliographic Details
Published in: PLoS Computational Biology 2024-05, Vol.20 (5), p.e1012061-e1012061
Main Authors: Michael, Richard; Kæstel-Hansen, Jacob; Mørch Groth, Peter; Bartels, Simon; Salomon, Jesper; Tian, Pengfei; Hatzakis, Nikos S; Boomsma, Wouter
Format: Article
Language: English
Description
Summary: To optimize proteins for particular traits holds great promise for industrial and pharmaceutical purposes. Machine Learning is increasingly applied in this field to predict properties of proteins, thereby guiding the experimental optimization process. A natural question is: How much progress are we making with such predictions, and how important is the choice of regressor and representation? In this paper, we demonstrate that different assessment criteria for regressor performance can lead to dramatically different conclusions, depending on the choice of metric, and how one defines generalization. We highlight the fundamental issues of sample bias in typical regression scenarios and how this can lead to misleading conclusions about regressor performance. Finally, we make the case for the importance of calibrated uncertainty in this domain.
ISSN: 1553-7358, 1553-734X
DOI: 10.1371/journal.pcbi.1012061