
Yield prediction through integration of genetic, environment, and management data through deep learning


Bibliographic Details
Published in: G3: Genes, Genomes, Genetics, 2023-04, Vol. 13 (4)
Main Authors: Kick, Daniel R; Wallace, Jason G; Schnable, James C; Kolkman, Judith M; Alaca, Barış; Beissinger, Timothy M; Edwards, Jode; Ertl, David; Flint-Garcia, Sherry; Gage, Joseph L; Hirsch, Candice N; Knoll, Joseph E; de Leon, Natalia; Lima, Dayane C; Moreta, Danilo E; Singh, Maninder P; Thompson, Addie; Weldekidan, Teclemariam; Washburn, Jacob D
Format: Article
Language: English
Description
Summary: Accurate prediction of the phenotypic outcomes produced by different combinations of genotypes, environments, and management interventions remains a key goal in biology with direct applications to agriculture, research, and conservation. The past decades have seen an expansion of new methods applied toward this goal. Here we predict maize yield using deep neural networks, compare the efficacy of 2 model development methods, and contextualize model performance using conventional linear and machine learning models. We examine the usefulness of incorporating interactions between disparate data types. We find deep learning and best linear unbiased predictor (BLUP) models with interactions had the best overall performance. BLUP models achieved the lowest average error, but deep learning models performed more consistently with similar average error. Optimizing deep neural network submodules for each data type improved model performance relative to optimizing the whole model for all data types at once. Examining the effect of interactions in the best-performing model revealed that including interactions altered the model's sensitivity to weather and management features, including a reduction of the importance scores for timepoints expected to have a limited physiological basis for influencing yield, namely those at the extreme end of the season, nearly 200 days post planting. Based on these results, deep learning provides a promising avenue for the phenotypic prediction of complex traits in complex environments and a potential mechanism to better understand the influence of environmental and genetic factors.
ISSN: 2160-1836
DOI: 10.1093/g3journal/jkad006