
Convolutional neural network for simultaneous prediction of several soil properties using visible/near-infrared, mid-infrared, and their combined spectra

Bibliographic Details
Published in: Geoderma 2019-10, Vol. 352, p. 251-267
Main Authors: Ng, Wartini; Minasny, Budiman; Montazerolghaem, Maryam; Padarian, Jose; Ferguson, Richard; Bailey, Scarlett; McBratney, Alex B.
Format: Article
Language: English
Description
Summary: No single instrument can characterize all soil properties because soil is a complex material. With the advancement of technology, laboratories have become equipped with various spectrometers. Fusing the output from different spectrometers is expected to give better predictions than using any single spectrometer alone. In this study, model performance from a single spectrometer (visible-near-infrared spectroscopy, vis-NIR, or mid-infrared spectroscopy, MIR) was compared to that of the combined spectrometers (vis-NIR and MIR). We selected a total of 14,594 samples from the Kellogg Soil Survey Laboratory (KSSL) database that had both vis-NIR and MIR spectra along with measurements of sand, clay, total C (TC) content, organic C (OC) content, cation exchange capacity (CEC), and pH. The dataset was randomly split into a 75% training set (n = 10,946) and a test set of the remaining samples (n = 3,648). Prediction models were constructed with partial least squares regression (PLSR) and the Cubist tree model. Additionally, we explored the use of a deep learning model, the convolutional neural network (CNN). We investigated various ways to feed spectral data to the CNN, either as one-dimensional (1D) data (as a spectrum) or as two-dimensional (2D) data (as a spectrogram). Compared to the PLSR model, we found that the CNN model improves prediction by an average of 33–42% using vis-NIR and 30–43% using MIR spectral data input. The relative accuracy improvement of CNN over the Cubist regression tree model ranged from 22–36% with vis-NIR and 16–27% with MIR spectral data input. Various methods to fuse the vis-NIR and MIR spectral data were explored. We compared the performance of spectral concatenation (for the PLSR and Cubist models) with the two-channel input method and the outer product analysis (OPA) method (for the CNN model). We found that the performance of the two-channel 1D CNN model was the best (R² = 0.95–0.98), followed closely by the OPA with CNN (R² = 0.93–0.98), the Cubist model with spectral concatenation (R² = 0.91–0.97), the two-channel 2D CNN model (R² = 0.90–0.95), and PLSR with spectral concatenation (R² = 0.87–0.95). Chemometric analysis of spectroscopy data relies on spectral pre-processing methods, such as spectral trimming, baseline correction, smoothing, and normalization, applied before the spectra are fed into the model. CNN achieved higher performance than the PLSR and Cubist models without utilizing the pre-processed spectral data. We also found that the predictions using the CNN model ret
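
The three fusion strategies named in the summary (spectral concatenation, two-channel input, and outer product analysis) differ mainly in the shape of the array handed to the model. The following is a minimal NumPy sketch of those shapes only, not the authors' implementation. The function names, the toy band counts, and the linear-interpolation `resample` helper are all assumptions made for illustration; the record does not say how the two spectra were aligned for the two-channel input.

```python
import numpy as np


def fuse_concatenate(vis_nir, mir):
    """Join spectra end to end: (n, p) and (n, q) -> (n, p + q)."""
    return np.concatenate([vis_nir, mir], axis=1)


def resample(spectra, n_bands):
    """Hypothetical helper: linearly interpolate each row to n_bands points."""
    old_axis = np.linspace(0.0, 1.0, spectra.shape[1])
    new_axis = np.linspace(0.0, 1.0, n_bands)
    return np.stack([np.interp(new_axis, old_axis, row) for row in spectra])


def fuse_two_channel(vis_nir, mir):
    """Stack spectra as channels: two (n, p) arrays -> (n, p, 2).

    Assumes both inputs already share the same number of bands.
    """
    if vis_nir.shape != mir.shape:
        raise ValueError("two-channel input needs equal-shaped spectra")
    return np.stack([vis_nir, mir], axis=-1)


def fuse_outer_product(vis_nir, mir):
    """Per-sample outer product: (n, p) and (n, q) -> (n, p, q)."""
    return np.einsum("ip,iq->ipq", vis_nir, mir)


# Toy data: 4 samples, 6 vis-NIR bands, 8 MIR bands (made-up sizes).
rng = np.random.default_rng(0)
vis_nir = rng.random((4, 6))
mir = rng.random((4, 8))

concatenated = fuse_concatenate(vis_nir, mir)
two_channel = fuse_two_channel(vis_nir, resample(mir, 6))
outer = fuse_outer_product(vis_nir, mir)

print(concatenated.shape, two_channel.shape, outer.shape)
```

Concatenation yields a flat feature vector that suits PLSR or Cubist, the two-channel layout suits a 1D CNN that reads both spectra along a shared axis, and the outer product yields one image-like matrix per sample for a 2D CNN.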
ISSN: 0016-7061, 1872-6259
DOI: 10.1016/j.geoderma.2019.06.016