Multiclass spatial predictions of borehole yield in southern Mali by means of machine learning classifiers

Bibliographic Details
Published in:Journal of hydrology. Regional studies 2022-12, Vol.44, p.101245, Article 101245
Main Authors: Gómez-Escalonilla, Diancoumba, O., Traoré, D.Y., Montero, E., Martín-Loeches, M., Martínez-Santos, P.
Format: Article
Language:English
Description
Summary: Regions of Bamako, Kati and Kangaba, southwestern Mali. Machine learning-based mapping of borehole yield. Three algorithms were trained on an imbalanced multiclass database of boreholes, with twenty variables used as predictors of borehole yield. All models returned balanced and geometric scores in the order of 0.80, with area under the receiver operating characteristic curve up to 0.87. Three main methodological conclusions are drawn: (a) evaluating different machine learning classifiers and resampling strategies, and then selecting the best-performing ones, is shown to be a good strategy in this type of study; (b) ad hoc calibration tools, such as data on borehole success rates, provide an apt complement to standard machine learning metrics; and (c) a multiclass approach with an unbalanced database represents a greater challenge than predicting a bivariate outcome, but potentially results in a finer depiction of field conditions. Alluvial sediments were found to be the most productive areas, while the Mandingue Plateau has the lowest groundwater potential. The piedmont areas showcase an intermediate groundwater prospect. Elevation, basement depth, slope and geology rank among the most important variables. Lower values of clay content, slope and elevation, and higher values of basement depth and saturated thickness, were linked to the most productive class.
Highlights:
• ML-based groundwater potential maps were produced with an imbalanced multiclass dataset.
• The 3 best models combined with the ADASYN resampling strategy showed scores around 0.80 in the predictions.
• Elevation, basement depth, slope and geology were among the most important variables.
• Alluvial sediments and piedmont areas were found to be the most productive areas.
ISSN:2214-5818
DOI:10.1016/j.ejrh.2022.101245
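
Note on the metrics named in the summary: the record does not define the "balanced and geometric scores". A common reading, assumed here and not confirmed by the record, is balanced accuracy (the mean of per-class recalls) and the geometric mean of per-class recalls, both of which are less flattering than plain accuracy when classes are imbalanced. The sketch below computes both for a multiclass problem using only the Python standard library. The class names and toy labels are hypothetical and are not data from the article.

```python
import math
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct predictions per class
    return {label: hits[label] / support[label] for label in support}


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of per-class recalls (zero if any class is never recovered)."""
    recalls = per_class_recall(y_true, y_pred)
    return math.prod(recalls.values()) ** (1.0 / len(recalls))


# Hypothetical, imbalanced three-class yield labels (not from the study).
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"] * 2
y_pred = (
    ["low"] * 5 + ["medium"]          # one low-yield borehole misclassified
    + ["medium"] * 2 + ["low"]        # one medium-yield borehole misclassified
    + ["high"] + ["medium"]           # one high-yield borehole misclassified
)

if __name__ == "__main__":
    print("balanced accuracy:", round(balanced_accuracy(y_true, y_pred), 3))
    print("geometric mean:   ", round(geometric_mean_score(y_true, y_pred), 3))
```

Because both scores weight each class equally, a model that performs well only on the majority class is penalised, which is why such metrics are commonly paired with resampling strategies like ADASYN on imbalanced data.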