Predictive geochemical mapping using machine learning in western Kenya
Humphrey, Olivier S.; Cave, Mark; Hamilton, Elliott M.; Osano, Odipo; Menya, Diana; Watts, Michael J. 2023 Predictive geochemical mapping using machine learning in western Kenya. Geoderma Regional, 35, e00731. 10.1016/j.geodrs.2023.e00731
Text (Open Access Paper)
1-s2.0-S235200942300127X-main.pdf - Published Version. Available under License Creative Commons Attribution 4.0.
Abstract/Summary
Digital soil mapping techniques represent a cost-effective method for obtaining detailed information regarding the spatial distribution of chemical elements in soils. Machine learning (ML) algorithms using random forest (RF) models have been developed for classification, pattern recognition and regression tasks; they are capable of modelling non-linear relationships using a range of datasets, identifying hierarchical relationships, and determining the importance of predictor variables. In this study, we describe a framework for spatial prediction based on RF modelling where inverse distance weighted (IDW) predictors are used in conjunction with ancillary environmental covariates. The model was applied to predict the total concentration (mg kg⁻¹) and assess the prediction uncertainty of 56 elements, soil pH and organic matter content using 466 soil samples in western Kenya; the results for iodine (I), selenium (Se), zinc (Zn) and soil pH are highlighted in this work. These elements were selected due to their contrasting biogeochemical cycles and widespread dietary deficiencies in sub-Saharan Africa, whilst soil pH is an important parameter controlling soil chemical reactions. Algorithm performance was evaluated by determining the relative importance of each predictor variable and the model's response using partial dependence profiles. The accuracy and precision of each RF model were assessed by evaluating out-of-bag predicted values. The models' R² values range from 0.31 to 0.64, whilst concordance correlation coefficient (CCC) values range from 0.51 to 0.77. The IDW predictor variables had the greatest impact on assessing the distribution of soil properties in the study area; however, the inclusion of ancillary environmental data improved model performance for all soil properties. The results presented in this paper highlight the benefits of ML algorithms which can incorporate multiple layers of data for spatial prediction, uncertainty assessment and attributing variable importance.
Additional research is now required to ensure health practitioners and the agri-community utilise the geochemical maps presented here for assessing the relationship between environmental geochemistry, endemic diseases and preventable micronutrient deficiency.
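As an illustration of two quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It shows how an IDW value might be computed for use as an RF predictor, and how the concordance correlation coefficient (CCC) compares observed and predicted values. The function names, the distance power of 2, and the leave-one-out scheme are all assumptions made for this example; the abstract does not specify them. The RF model itself is omitted.

```python
import math


def idw_predict(target, points, values, power=2.0):
    """Inverse distance weighted estimate at `target` from known (x, y) points.

    `power` is an assumed default; the abstract does not state the exponent used.
    """
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(target[0] - x, target[1] - y)
        if d == 0.0:
            return v  # target coincides with a sample: return its value directly
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den


def loo_idw_feature(points, values, power=2.0):
    """Leave-one-out IDW value for every sample (hypothetical helper).

    Each sample's own measurement is excluded, so the resulting predictor
    column does not leak the target value into the model.
    """
    feats = []
    for i in range(len(points)):
        other_points = points[:i] + points[i + 1:]
        other_values = values[:i] + values[i + 1:]
        feats.append(idw_predict(points[i], other_points, other_values, power))
    return feats


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient with 1/n variances."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)
```

Unlike R², the CCC penalises systematic offset as well as scatter: predictions that track the observations perfectly but are shifted by a constant still score below 1.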
Item Type: | Publication - Article |
---|---|
Digital Object Identifier (DOI): | 10.1016/j.geodrs.2023.e00731 |
ISSN: | 2352-0094 |
Additional Keywords: | IGRD |
Date made live: | 04 Dec 2023 14:33 +0 (UTC) |
URI: | https://nora.nerc.ac.uk/id/eprint/536388 |