Incorporating data sets with multiple sources of uncertainty in integrated species distribution models

Ecology and Evolution
By: F. Lunt, C.L. Scher, R.O. Mummah, and D.A. Miller

Abstract

Data integration methods aim to improve species distribution estimates by incorporating multiple sources of uncertainty across datasets. Two major sources of uncertainty are: (1) variation in sampling effort across space and within datasets, and (2) variation in reliability associated with data collection protocols or timing among datasets. Our goal was to evaluate how different approaches to addressing these uncertainties influence the predictive performance of integrated models. We modeled distributions of four bird species using three datasets that differed in sampling design. We examined three strategies to reduce uncertainty: (1) filtering data, (2) incorporating functions that account for uncertainty in observation models, and (3) varying how datasets are integrated into a single estimate. We first examined methods to account for variable effort in observations, focusing on both spatial differences in sampling intensity and the effort given to a single observation record. Sampling effort was best addressed through conservative filtering, including spatial thinning and excluding observations with highly variable effort. We then examined approaches to account for datasets with differing reliability, considering how to handle potential false positive detections due to either misidentification or changes in distributions. Treating less reliable data as a covariate, an approach previously suggested for data integration that can greatly speed up model fitting, performed well. Other effective approaches included directly modeling false positive rates and completely excluding less reliable datasets. Our results provide insights into best practices for handling uncertainty in integrated models and demonstrate the flexible options available for doing so.

Suggested Citation

Lunt, F., Scher, C.L., Mummah, R.O., and Miller, D.A., 2026, Incorporating data sets with multiple sources of uncertainty in integrated species distribution models: Ecology and Evolution, v. 16, no. 4, e73185, 11 p., https://doi.org/10.1002/ece3.73185.

Additional publication details

Publication type: Article
Publication subtype: Journal Article
Title: Incorporating data sets with multiple sources of uncertainty in integrated species distribution models
Series title: Ecology and Evolution
DOI: 10.1002/ece3.73185
Volume: 16
Issue: 4
Publication date: April 09, 2026
Year published: 2026
Language: English
Publisher: Wiley
Contributing office(s): Eastern Ecological Science Center
Description: e73185, 11 p.
Country: United States
State: Pennsylvania