Fast and flexible Bayesian species distribution modelling using Gaussian processes

Golding, Nick; Purse, Bethan V. 2016 Fast and flexible Bayesian species distribution modelling using Gaussian processes. Methods in Ecology and Evolution, 7 (5). 598-608.

N513370JA.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

1. Species distribution modelling (SDM) is widely used in ecology, and predictions of species distributions inform both policy and ecological debates. Methods that combine high predictive accuracy with biological interpretability are therefore preferable. Gaussian processes (GPs) are a highly flexible approach to statistical modelling and have recently been proposed for SDM. GP models fit smooth but potentially complex response functions that can account for high-dimensional interactions between predictors. We propose fitting GP SDMs using deterministic numerical approximations, rather than Markov chain Monte Carlo methods, in order to make GPs more computationally efficient and easier to use.

2. We introduce GP models and their application to SDM, illustrate how ecological knowledge can be incorporated into GP SDMs via Bayesian priors and formulate a simple GP SDM that can be fitted efficiently. This model can be fitted either by learning the hyperparameters or by using a fixed approximation to them. Using a subset of the North American Breeding Bird Survey data set, we compare the out-of-sample predictive accuracy of these models with several commonly used SDM approaches for both presence/absence and presence-only data.

3. Predictive accuracy of GP SDMs fitted by Laplace approximation was greater than that of boosted regression trees, generalized additive models (GAMs) and logistic regression when trained on presence/absence data, and greater than that of all of these models plus MaxEnt when trained on presence-only data. GP SDMs fitted using a fixed approximation to hyperparameters were no less accurate than those with MAP estimation and on average 70 times faster, equivalent in speed to GAMs.

4. As well as having strong predictive power for this data set, GP SDMs offer a convenient method for incorporating prior knowledge of the species' ecology. By fitting these methods using efficient numerical approximations, they may easily be applied to large data sets and automatically for many species. An R package, GRaF, is provided to enable SDM users to fit GP models.
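
As a rough illustration of what fitting a GP model to presence/absence data by Laplace approximation involves, the sketch below finds the posterior mode of the latent function by Newton iteration and predicts probability of presence from it. It is not the GRaF implementation. The single covariate, the zero-mean prior, the logistic link, the fixed lengthscale, the toy data and all function names (`sq_exp_kernel`, `solve`, `fit_laplace`, `predict_presence`) are assumptions made for this example only.

```python
import math

def sq_exp_kernel(a, b, lengthscale=1.0):
    """Squared-exponential covariance between two covariate values."""
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_laplace(x, y, lengthscale=1.0, max_iter=50, tol=1e-8):
    """Newton search for the posterior mode of the latent function f.

    Returns weights a such that the mode is f = K a. Hyperparameters
    (here only the lengthscale) are held fixed, not learned.
    """
    n = len(x)
    K = [[sq_exp_kernel(xi, xj, lengthscale) for xj in x] for xi in x]
    f = [0.0] * n
    a = [0.0] * n
    for _ in range(max_iter):
        p = [sigmoid(fi) for fi in f]
        W = [pi * (1.0 - pi) for pi in p]  # negative Hessian of log-likelihood
        b = [W[i] * f[i] + (y[i] - p[i]) for i in range(n)]
        # Newton step: solve (I + W K) a = b, then f_new = K a
        A = [[(1.0 if i == j else 0.0) + W[i] * K[i][j] for j in range(n)]
             for i in range(n)]
        a = solve(A, b)
        f_new = [sum(K[i][j] * a[j] for j in range(n)) for i in range(n)]
        change = max(abs(f_new[i] - f[i]) for i in range(n))
        f = f_new
        if change < tol:
            break
    return a

def predict_presence(x_train, a, x_new, lengthscale=1.0):
    """Probability of presence from the latent mean at x_new.

    Simplification: the latent predictive variance is ignored.
    """
    latent = sum(sq_exp_kernel(x_new, xi, lengthscale) * ai
                 for xi, ai in zip(x_train, a))
    return sigmoid(latent)

if __name__ == "__main__":
    # Toy data: absences at low covariate values, presences at high ones
    x = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    a = fit_laplace(x, y)
    for x_new in (-1.5, 0.0, 1.5):
        print(x_new, predict_presence(x, a, x_new))
```

Holding the lengthscale fixed keeps the example short. Learning the hyperparameters would add an outer optimisation loop around `fit_laplace`, which the sketch does not attempt.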

Item Type: Publication - Article
Digital Object Identifier (DOI):
UKCEH and CEH Sections/Science Areas: Reynard
ISSN: 2041-210X
Additional Information: Open Access paper - full text available via Official URL link.
Additional Keywords: boosted regression trees, Gaussian processes, generalized additive models, MaxEnt, species distribution models
NORA Subject Terms: Ecology and Environment
Date made live: 04 Apr 2016 14:06 +0 (UTC)
