Reproducibility starts at the source: R, Python, and Julia Packages for retrieving USGS hydrologic data

Much of modern science takes place in a computational environment, and, increasingly, that environment is programmed using R, Python, or Julia. Furthermore, most scientific data now live in the cloud, so the first step in many workflows is to query a cloud database and load the response into a computational environment for further analysis. Tools that facilitate programmatic data retrieval are therefore a critical component of reproducible scientific workflows, and Earth science is no different in this regard. To fulfill that basic need, we developed R, Python, and Julia packages that provide programmatic access to the U.S. Geological Survey’s National Water Information System database and the multi-agency Water Quality Portal. Together, these packages create a common interface for retrieving hydrologic data in the Jupyter ecosystem, which is widely used in water research, operations, and teaching. Source code, documentation, and tutorials for the packages are available on GitHub, where users can learn, raise issues, or contribute improvements within a single platform, which helps foster engagement and collaboration between data providers and their users.
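The "query, then load" step described in the abstract can be sketched with the Python standard library alone. This is a minimal illustration, not the packages' own API: the endpoint URL, the query parameter names, the helper names `build_dv_query` and `parse_rdb`, and the sample response are all assumptions introduced here, and the sample values are synthetic rather than real measurements.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint for USGS daily values; not stated in the abstract.
NWIS_DV_URL = "https://waterservices.usgs.gov/nwis/dv/"


def build_dv_query(site, start, end, parameter_cd="00060"):
    """Return a request URL for daily values at one site (hypothetical helper)."""
    params = {
        "format": "rdb",          # tab-delimited text response
        "sites": site,
        "startDT": start,
        "endDT": end,
        "parameterCd": parameter_cd,
    }
    return NWIS_DV_URL + "?" + urlencode(params)


def parse_rdb(text):
    """Parse tab-delimited RDB text into a list of dicts (hypothetical helper).

    Skips '#' comment lines and the column-format row that follows the header.
    """
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    rows = list(csv.reader(io.StringIO("\n".join(lines)), delimiter="\t"))
    header, records = rows[0], rows[2:]
    return [dict(zip(header, rec)) for rec in records]


# Synthetic response for illustration only; site number and values are made up.
SAMPLE_RDB = (
    "# synthetic example\n"
    "agency_cd\tsite_no\tdatetime\tvalue\n"
    "5s\t15s\t20d\t14n\n"
    "USGS\t00000000\t2020-01-01\t1.0\n"
    "USGS\t00000000\t2020-01-02\t2.0\n"
)

url = build_dv_query("00000000", "2020-01-01", "2020-01-02")
records = parse_rdb(SAMPLE_RDB)
```

Recording the query as code, rather than downloading a file by hand, is what makes the retrieval step repeatable: rerunning the script reissues the same request. The packages described here wrap this kind of request and parsing behind a common interface.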
Publication type: Article
Publication subtype: Journal Article
Title: Reproducibility starts at the source: R, Python, and Julia Packages for retrieving USGS hydrologic data
Series title: Water
DOI: 10.3390/w15244236
Volume: 15
Issue: 24
Year published: 2023
Language: English
Publisher: MDPI
Contributing office(s): Central Midwest Water Science Center, WMA - Integrated Information Dissemination Division
Description: 4236, 10 p.