Descriptive inference using large, unrepresentative nonprobability samples: an introduction for ecologists

Boyd, Robin J. ORCID: https://orcid.org/0000-0002-7973-9865; Stewart, Gavin B.; Pescott, Oliver L. ORCID: https://orcid.org/0000-0002-0685-8046. 2024 Descriptive inference using large, unrepresentative nonprobability samples: an introduction for ecologists. Ecology, 105 (2), e4214. 14 pp. https://doi.org/10.1002/ecy.4214

N536554JA.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

Official URL: http://dx.doi.org/10.1002/ecy.4214

Abstract/Summary

Biodiversity monitoring usually involves drawing inferences about some variable of interest across a defined landscape from observations made at a sample of locations within that landscape. If the variable of interest differs between sampled and non-sampled locations, and no mitigating action is taken, then the sample is unrepresentative and inferences drawn from it will be biased. It is possible to adjust unrepresentative samples so that they more closely resemble the wider landscape in terms of “auxiliary variables”. A good auxiliary variable is a common cause of sample inclusion and the variable of interest, and if it explains an appreciable portion of the variance in both, then inferences drawn from the adjusted sample will be closer to the truth. We applied six types of survey sample adjustment—subsampling, quasi-randomisation, poststratification, superpopulation modelling, a “doubly robust” procedure, and multilevel regression and poststratification—to a simple two-part biodiversity monitoring problem. The first part was to estimate mean occupancy of the plant Calluna vulgaris in Great Britain in two time-periods (1987-1999 and 2010-2019); the second was to estimate the difference between the two (i.e. the trend). We estimated the means and trend using large, but (originally) unrepresentative, samples from a citizen science dataset. Compared to the unadjusted estimates, the means and trends estimated using most adjustment methods were more accurate, although standard uncertainty intervals generally did not cover the true values. Completely unbiased inference is not possible from an unrepresentative sample without knowing and having data on all relevant auxiliary variables. Adjustments can reduce the bias if auxiliary variables are available and selected carefully, but the potential for residual bias should be acknowledged and reported.
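
To make the idea of adjusting on an auxiliary variable concrete, the following is a minimal sketch of poststratification, one of the six adjustment types named in the abstract. It is not the authors' code or data: the strata ("lowland", "upland"), their population shares, and the sample records are all hypothetical values chosen for illustration, and the function name `poststratified_mean` is our own. The sketch assumes the auxiliary variable is a single categorical stratum whose share of the whole landscape is known.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Weight each stratum's sample mean by its known share of the population.

    sample: iterable of (stratum, y) pairs, where y is 1 if occupied, else 0.
    population_shares: dict mapping stratum -> share of the landscape (sums to 1).
    """
    totals = defaultdict(lambda: [0, 0])  # stratum -> [sum of y, count]
    for stratum, y in sample:
        totals[stratum][0] += y
        totals[stratum][1] += 1

    # Poststratification needs at least one sampled unit in every stratum.
    missing = set(population_shares) - set(totals)
    if missing:
        raise ValueError(f"no sampled units in strata: {sorted(missing)}")

    return sum(
        share * totals[stratum][0] / totals[stratum][1]
        for stratum, share in population_shares.items()
    )


# Hypothetical landscape: 80% of grid cells are lowland, 20% are upland.
population_shares = {"lowland": 0.8, "upland": 0.2}

# Hypothetical sample that over-represents upland cells (1 = species recorded).
sample = (
    [("upland", 1)] * 6 + [("upland", 0)] * 2
    + [("lowland", 1)] * 1 + [("lowland", 0)] * 3
)

naive = sum(y for _, y in sample) / len(sample)
adjusted = poststratified_mean(sample, population_shares)

print(f"naive mean occupancy:          {naive:.3f}")
print(f"poststratified mean occupancy: {adjusted:.3f}")
```

In this toy case the unadjusted mean is pulled upwards because upland cells, where the species is more often recorded, are over-sampled; reweighting to the landscape's stratum shares lowers the estimate. As the abstract cautions, such an adjustment only removes bias to the extent that the chosen auxiliary variable explains both sample inclusion and the variable of interest.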

Item Type:	Publication - Article
Digital Object Identifier (DOI):	https://doi.org/10.1002/ecy.4214
UKCEH and CEH Sections/Science Areas:	Biodiversity (Science Area 2017-)
ISSN:	0012-9658
Additional Information. Not used in RCUK Gateway to Research.:	Open Access paper - full text available via Official URL link.
Additional Keywords:	bias, biodiversity monitoring, nonprobability samples, weighting
NORA Subject Terms:	Ecology and Environment Data and Information
Related URLs:	Dataset
Date made live:	16 Jan 2024 09:51 +0 (UTC)
URI:	https://nora.nerc.ac.uk/id/eprint/536554
