Fast‐tracking ecological interpretation using bespoke quantitative large language models

Gallois, Elise C. ORCID: https://orcid.org/0000-0002-9402-1931; Salili‐James, Arianna ORCID: https://orcid.org/0000-0003-1125-2054; Poon, Sanson T.S. ORCID: https://orcid.org/0000-0001-5297-7452; Trebski, Artur ORCID: https://orcid.org/0009-0006-3259-5215; Redding, David W. ORCID: https://orcid.org/0000-0001-8615-1798. 2025 Fast‐tracking ecological interpretation using bespoke quantitative large language models. Methods in Ecology and Evolution. 10.1111/2041-210X.70184

Full text: Published Version (PDF, 1MB), available under License Creative Commons Attribution 4.0.

Abstract/Summary

1. The Anthropocene presents significant challenges for global biodiversity, public health and ecosystem stability. The wealth of publicly available near‐real‐time ecology and climate data can be used to monitor these challenges and allow practitioners to develop mitigation strategies.
2. There is untapped potential to apply large language models (LLMs) to quantitative ecological and environmental datasets, enabling researchers and practitioners to use natural language queries to transform ecological observations into actionable insights for both conservation action and communication of results to diverse audiences. Advances in artificial intelligence (AI), and particularly in LLMs, offer emerging opportunities to address these challenges. LLMs are increasingly proficient at identifying patterns and semantic relationships within textual data and are highly customisable. Accessible AI tools can facilitate communication across research and policy sectors.
3. Here, we present a roadmap for designing and implementing multi‐modal LLMs to answer ecological research questions. To build robust ‘virtual quantitative assistants’ capable of fast‐tracking data interpretation, we advocate for strategic planning, data stewardship practices, careful prompt engineering and model evaluation as key steps in the LLM development process.
4. We discuss potential use‐case examples that apply the LangChain framework to analyse citizen science data. Using our LLM roadmap, we highlight the importance of iterative and strategic prompt engineering and agent selection, in addition to iteratively evaluating model output. As LLM software continues to evolve, its integration into ecological and environmental research can empower ecologists with purpose‐built tools that bridge the gap between data collection and actionable solutions.

Item Type: Publication - Article
Digital Object Identifier (DOI): 10.1111/2041-210X.70184
UKCEH and CEH Sections/Science Areas: Environmental Pressures and Responses (2025-)
ISSN: 2041-210X
Additional Information: Open Access paper - full text available via Official URL link.
Additional Keywords: artificial intelligence, citizen science, large language models, multi-agent models, natural language processing
NORA Subject Terms: Ecology and Environment; Computer Science; Data and Information
Date made live: 06 Nov 2025 11:09 +0 (UTC)
URI: https://nora.nerc.ac.uk/id/eprint/540512
