
Application, interpretability and prediction of machine learning method combined with LSTM and LightGBM-a case study for runoff simulation in an arid area

Bibliographic Details
Published in:	Journal of hydrology (Amsterdam) 2023-10, Vol.625, p.130091, Article 130091
Main Authors:	Bian, Lekang, Qin, Xueer, Zhang, Chenglong, Guo, Ping, Wu, Hui
Format:	Article
Language:	English
Subjects:	Arid area; case studies; climate; flood control; Interpretability; Light gradient boosting machine; Long short-term memory; neural networks; prediction; rivers; runoff; Runoff prediction; The reciprocal error method; watersheds

Description
Summary:	• A machine learning-based runoff prediction method is proposed.
• Integrating the LSTM and LightGBM models can effectively improve prediction accuracy in arid areas.
• Interpretive analysis is undertaken to identify key feature variables.
• Future annual runoff is predicted under 12 climate scenarios for the next 50 years.

Runoff prediction can provide a scientific basis for flood control, disaster reduction and water resources planning. Because runoff prediction involves many uncertainties, precise predictions are difficult to make. To improve accuracy, this study uses the reciprocal error method to combine two machine learning techniques, Long Short-Term Memory (LSTM) and Light Gradient Boosting Machine (LightGBM), into an integrated data-driven model (LSTM-LightGBM) for runoff prediction. To demonstrate its applicability, the model is applied to annual runoff prediction at the Caiqi hydrological monitoring station on the Shiyang River, in an arid area. Four indicators, Error of Peak (EP), Nash-Sutcliffe Efficiency (NSE), Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), are adopted to evaluate the prediction performance of the LSTM, LightGBM and LSTM-LightGBM methods under the same hyperparameter combinations. The interpretability of the LSTM and LightGBM models is then explored using the permutation importance method and Shapley Additive exPlanations (SHAP) values, respectively. Finally, annual runoff at the Caiqi station for the next 50 years (2025–2075) is predicted with the LSTM-LightGBM model under 12 climate scenarios. The results show that: (1) the integrated model (LSTM-LightGBM) performs better than the two single models in NSE (0.92), RMSE (0.075 million m³) and MAE (0.046 million m³), and in EP (i.e., in reproducing peak and valley runoff); (2) in this case, the interpretability analysis finds that four feature variables have the greatest influence on the target variable; (3) the 12 combined climate scenarios used in this investigation produce generally steady predictions. The scenarios with the highest and lowest mean values are GFDL RCP 6.0 (3.12 × 10⁸ m³) and IPSL RCP 2.6 (3.04 × 10⁸ m³), respectively, which correspond to decreases of 24.09% and 26.03% relative to the mean annual runoff of 4.11 × 10⁸ m³ in the baseline period (1955–2017). These findings can provide a scientific basis for future water resources planning in the downstream part of the Shiyang River Basin.
ISSN:	0022-1694 1879-2707
DOI:	10.1016/j.jhydrol.2023.130091
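
The summary above names the reciprocal error method and the NSE, RMSE and MAE indicators without giving formulas. The following is a minimal sketch of one common form of that weighting, in which each model's weight is proportional to the other model's error, so the more accurate model counts for more. It is not the authors' implementation: the choice of MAE as the error measure, the function names and the toy numbers are all assumptions made for illustration, and in practice the weights would normally be estimated on a training or validation period rather than on the data being scored.

```python
import numpy as np


def rmse(obs, sim):
    """Root Mean Squared Error between observed and simulated series."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def mae(obs, sim):
    """Mean Absolute Error between observed and simulated series."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.mean(np.abs(obs - sim)))


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus residual variance over observed variance."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def reciprocal_error_weights(err_a, err_b):
    """Weights for two models; each weight is the *other* model's share of total error.

    Assumption: err_a and err_b are non-negative scalar errors (here MAE).
    """
    total = err_a + err_b
    if total == 0:
        # Both models are perfect, so any convex combination is equivalent.
        return 0.5, 0.5
    return err_b / total, err_a / total


def combine(pred_a, pred_b, err_a, err_b):
    """Weighted average of two prediction series using reciprocal-error weights."""
    w_a, w_b = reciprocal_error_weights(err_a, err_b)
    return w_a * np.asarray(pred_a, dtype=float) + w_b * np.asarray(pred_b, dtype=float)


# Toy numbers, invented for illustration only (not data from the study).
obs = np.array([3.0, 4.0, 5.0, 4.0])
pred_a = np.array([3.3, 4.1, 5.4, 3.8])  # stands in for one single model
pred_b = np.array([2.9, 3.8, 4.9, 4.1])  # stands in for the other single model

err_a, err_b = mae(obs, pred_a), mae(obs, pred_b)
w_a, w_b = reciprocal_error_weights(err_a, err_b)
combined = combine(pred_a, pred_b, err_a, err_b)

for name, sim in [("model A", pred_a), ("model B", pred_b), ("combined", combined)]:
    print(f"{name}: NSE={nse(obs, sim):.3f} RMSE={rmse(obs, sim):.3f} MAE={mae(obs, sim):.3f}")
```

With these toy series the two models err in partly opposite directions, so the weighted average has a lower MAE than either input; that is not guaranteed in general.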