
Estimation of soybean yield based on high-throughput phenotyping and machine learning


Bibliographic Details
Published in: Frontiers in plant science 2024-06, Vol.15, p.1395760
Main Authors: Li, Xiuni, Chen, Menggen, He, Shuyuan, Xu, Xiangyao, He, Lingxiao, Wang, Li, Gao, Yang, Tang, Fenda, Gong, Tao, Wang, Wenyan, Xu, Mei, Liu, Chunyan, Yu, Liang, Liu, Weiguo, Yang, Wenyu
Format: Article
Language: English
Description
Summary: Soybeans are an important crop used for food, oil, and feed. However, China's soybean self-sufficiency is highly inadequate, with imports exceeding 80% each year. RGB cameras are powerful tools for estimating crop yield, and machine learning offers a practical way to improve yield predictions from a variety of image features. However, the choice of input parameters and models, in particular which features and which model perform best, significantly influences soybean yield prediction. This study used an RGB camera to capture soybean canopy images from both side and top perspectives during the R6 stage (pod filling stage) for 240 soybean varieties (a natural population drawn from four provinces in China: Sichuan, Yunnan, Chongqing, and Guizhou). Morphological, color, and textural features of the soybeans were extracted from these images, and feature selection was then performed on the image parameters using a Pearson correlation coefficient threshold of ≥0.5. Five machine learning methods (CatBoost, LightGBM, RF, GBDT, and MLP) were used to build soybean yield estimation models from the individual and combined image parameters of the two perspectives. The results showed that (1) GBDT was the optimal model for predicting soybean yield, with a test-set R² of 0.82, an RMSE of 1.99 g/plant, and an MAE of 3.12%; and (2) fusing multiangle and multitype indicators improved soybean yield prediction accuracy. Combining parameters extracted from RGB images with machine learning therefore has great potential for estimating soybean yield, providing a theoretical basis and technical support for accelerating the soybean breeding process.
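The feature-selection and evaluation steps named in the summary can be illustrated with a minimal sketch. It is not the authors' code: the feature names and toy values are hypothetical, the threshold is applied to the absolute value of the Pearson coefficient (an assumption, since the summary does not say how negative correlations were handled), and MAE is computed in the target's own units rather than as the percentage reported above.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sd_x * sd_y)

def select_features(features, target, threshold=0.5):
    """Return names of features whose |r| with the target meets the threshold.

    Using |r| is an assumption of this sketch, not a detail from the summary.
    """
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]

def regression_metrics(y_true, y_pred):
    """Compute R^2, RMSE, and MAE (in target units) for a set of predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }

# Hypothetical image features for five plants (illustrative values only).
features = {
    "canopy_area": [1.0, 2.0, 3.0, 4.0, 5.0],
    "green_ratio": [2.0, 1.0, 4.0, 3.0, 5.0],
    "texture_noise": [3.0, 1.0, 2.0, 1.0, 3.0],
}
yield_g_per_plant = [2.0, 4.1, 5.9, 8.2, 9.8]

selected = select_features(features, yield_g_per_plant, threshold=0.5)
print(selected)
```

The selected columns would then be passed to a regressor such as GBDT, and `regression_metrics` applied to its held-out predictions; the model-fitting step is left out here to keep the sketch dependency-free.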
ISSN: 1664-462X
DOI: 10.3389/fpls.2024.1395760