ARCHI: A new R package for automated imputation of regionally correlated hydrologic records
Links
- More information: Publisher Index Page (via DOI)
- Data Release: USGS data release - Example Groundwater-Level Datasets and Benchmarking Results for the Automated Regional Correlation Analysis for Hydrologic Record Imputation (ARCHI) Software Package
- Open Access Version: Publisher Index Page
Abstract
Missing data in hydrological records can limit resource assessment, process understanding, and predictive modeling. Here, we present ARCHI (Automated Regional Correlation Analysis for Hydrologic Record Imputation), a new, open-source software package in R designed to aggregate, impute, cluster, and visualize regionally correlated hydrologic records. ARCHI imputes missing data in “target” records by linear regression using more complete “reference” records as predictors. Automated imputation is implemented using a novel, iterative algorithm that allows each site to be considered a target or reference for regression, growing the pool of complete references with each imputed record until viable gap-filling ceases. Users can limit artifacts from spurious correlations by specifying model-acceptance criteria and applying geospatial, correlation, and group-based filters to control reference selection. ARCHI provides additional functions for visualizing results, clustering records with similar correlation structures, evaluating holdout data, and interactive parameterization with an accessible and intuitive graphical user interface (GUI). This methods brief provides an overview of the ARCHI package, modeling guidelines, and benchmarking on two regional groundwater-level datasets from the Central Valley, CA, and Long Island, NY. We evaluate ARCHI alongside widely used multivariate imputation software to highlight and contextualize its computational efficiency, imputation accuracy, and model transparency when applied to large groundwater-level datasets.
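The iterative idea described in the abstract can be illustrated with a minimal Python sketch. This is not ARCHI's code or API (ARCHI is an R package): the function name `iterative_regression_impute`, the parameters `min_overlap` and `min_r2`, and the toy records are all hypothetical. The sketch also makes simplifying assumptions: only fully complete records serve as references, each target is filled from a single best reference, and the only acceptance criterion is an R² threshold.

```python
import numpy as np


def iterative_regression_impute(records, min_overlap=3, min_r2=0.8):
    """Fill gaps (NaN) in each record by simple linear regression on a complete record.

    Hypothetical illustration only: records maps a site name to an equal-length
    1-D series. Returns the filled copies and a log of which reference filled
    which target.
    """
    data = {name: np.asarray(vals, dtype=float).copy() for name, vals in records.items()}
    filled_by = {}  # target name -> reference name, kept for transparency
    progress = True
    while progress:  # stop when a full pass fills nothing new
        progress = False
        # The reference pool is rebuilt each pass, so records imputed in an
        # earlier pass become available as predictors.
        references = [name for name, vals in data.items() if not np.isnan(vals).any()]
        for target, y in data.items():
            gaps = np.isnan(y)
            obs = ~gaps
            if not gaps.any() or obs.sum() < min_overlap:
                continue
            best = None
            for ref in references:
                x = data[ref]
                # Skip constant series: correlation is undefined for them.
                if x[obs].max() == x[obs].min() or y[obs].max() == y[obs].min():
                    continue
                r2 = np.corrcoef(x[obs], y[obs])[0, 1] ** 2
                if r2 >= min_r2 and (best is None or r2 > best[0]):
                    slope, intercept = np.polyfit(x[obs], y[obs], 1)
                    best = (r2, ref, slope, intercept)
            if best is not None:
                _, ref, slope, intercept = best
                y[gaps] = slope * data[ref][gaps] + intercept  # fill only the gaps
                filled_by[target] = ref
                progress = True
    return data, filled_by


# Toy records: "a" is complete, "b" tracks "a" linearly, "c" is poorly correlated.
toy = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [3, 5, np.nan, 9, np.nan, 13],
    "c": [5, 1, 4, np.nan, 2, 3],
}
filled, used = iterative_regression_impute(toy)
print(used)
print(filled["b"])
print(filled["c"])
```

In this toy case, record "b" is filled from "a", while "c" keeps its gap because no reference meets the R² threshold. That mirrors, in a very reduced form, how an acceptance criterion can keep weakly correlated records from being gap-filled.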
| Publication type | Article |
|---|---|
| Publication Subtype | Journal Article |
| Title | ARCHI: A new R package for automated imputation of regionally correlated hydrologic records |
| Series title | Groundwater |
| DOI | 10.1111/gwat.13474 |
| Volume | 62 |
| Issue | 4 |
| Publication Date | February 28, 2025 |
| Year Published | 2025 |
| Language | English |
| Publisher | Wiley |
| Contributing office(s) | California Water Science Center |
| Description | 16 p. |
| First page | 595 |
| Last page | 610 |
| Country | United States |
| State | California, New York |
| Other Geospatial | Central Valley, Long Island |