Abstract
Statistical models of nitrate occurrence in the glacial
aquifer system of the northern United States, developed by the
U.S. Geological Survey, use observed relations between nitrate
concentrations and sets of explanatory variables—representing
well-construction, environmental, and source characteristics—
to predict the probability that nitrate, as nitrogen, will exceed a
threshold concentration. However, the models do not explicitly
account for the processes that control the transport of nitrogen
from surface sources to a pumped well and instead use area-weighted
mean spatial variables computed from within a circular buffer
around the well as a simplified source-area conceptualization.
The use of models that explicitly represent physical-transport
processes can inform and, potentially, improve these statistical
models. Specifically, groundwater-flow models simulate
advective transport—predominant in many surficial aquifers—
and can contribute to the refinement of the statistical models
by (1) providing for improved, physically based representations
of a source area to a well, and (2) allowing for more
detailed estimates of environmental variables.
A source area to a well, known as a contributing recharge
area, represents the area at the water table that contributes
recharge to a pumped well; a well pumped at a volumetric rate
equal to the amount of recharge through a circular buffer will
result in a contributing recharge area that is the same size as
the buffer but has a shape that is a function of the hydrologic
setting. These volume-equivalent contributing recharge areas
will approximate circular buffers in areas of relatively flat
hydraulic gradients, such as near groundwater divides, but in
areas with steep hydraulic gradients will be elongated in the
upgradient direction and agree less with the corresponding
circular buffers.
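The volume equivalence described above reduces to a simple relation: the pumping rate equals the recharge rate multiplied by the buffer area. The following is a minimal sketch, assuming spatially uniform recharge over the buffer; the function name, units, and numeric values are hypothetical and are not taken from the study.

```python
import math

def volume_equivalent_pumping_rate(recharge_m_per_yr, buffer_radius_m):
    """Pumping rate (cubic meters per year) equal to recharge through a circular buffer.

    Assumes uniform recharge over the buffer (an illustrative simplification).
    """
    buffer_area_m2 = math.pi * buffer_radius_m ** 2
    return recharge_m_per_yr * buffer_area_m2

# Hypothetical inputs: 0.2 m/yr of recharge and a 500-m buffer radius.
pumping_rate = volume_equivalent_pumping_rate(0.2, 500.0)
```

A well pumped at this rate in a flow model would, under the mass-balance assumption, have a contributing recharge area of the same size as the buffer; only its shape would depend on the hydrologic setting.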
The degree to which process-model-estimated contributing
recharge areas, which simulate advective transport and
therefore account for local hydrologic settings, would inform
and improve the development of statistical models can be
implicitly estimated by evaluating the differences between
explanatory variables estimated from the contributing recharge
areas and the circular buffers used to develop existing statistical
models. The larger the difference in estimated variables,
the more likely that statistical models would be changed, and
presumably improved, if explanatory variables estimated from
contributing recharge areas were used in model development.
Comparing model predictions from the two sets of estimated
variables would further quantify—albeit implicitly—how an
improved, physically based estimate of explanatory variables
would be reflected in model predictions. Differences between
the two sets of estimated explanatory variables and resultant
model predictions vary spatially; greater differences are
associated with areas of steep hydraulic gradients. A direct
comparison, however, would require the development of a
separate set of statistical models using explanatory variables
from contributing recharge areas.
Area-weighted means of three environmental variables—silt content, alfisol content, and depth to water from the U.S.
Department of Agriculture State Soil Geographic (STATSGO)
data—and one nitrogen-source variable (fertilizer-application
rate from county data mapped to Enhanced National Land
Cover Data 1992 (NLCDe 92) agricultural land use) can vary
substantially between circular buffers and volume-equivalent
contributing recharge areas and among contributing recharge
areas for different sets of well variables. The differences in
estimated explanatory variables are a function of the same
factors affecting the contributing recharge areas as well as
the spatial resolution and local distribution of the underlying
spatial data. As a result, differences in estimated variables
between circular buffers and contributing recharge areas
are complex and site specific as evidenced by differences
in estimated variables for circular buffers and contributing
recharge areas of existing public-supply and network wells
in the Great Miami River Basin. Large differences in area-weighted
mean environmental variables were observed at the
basin scale, as determined by using a network of uniformly
spaced hypothetical wells; the differences have a spatial
pattern that generally is similar to spatial patterns in the
underlying STATSGO data. Generally, the largest differences
were observed for area-weighted nitrogen-application rate
from county and national land-use data; the basin-scale
differences ranged from -1,600 (indicating a larger value from
within the volume-equivalent contributing recharge area) to
1,900 kilograms per year (kg/yr); the range in the underlying spatial data was from 0 to 2,200 kg/yr. Silt content, alfisol
content, and nitrogen-application rate are defined by the
underlying spatial data and are external to the groundwater
system; however, depth to water is an environmental variable
that can be estimated in more detail and, presumably, in a
more physically based manner using a groundwater-flow
model than using the spatial data. Model-calculated depths to
water within circular buffers in the Great Miami River Basin
differed substantially from values derived from the spatial data
and had a much larger range.
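The comparison of area-weighted means between a circular buffer and a contributing recharge area can be sketched as follows. This is an illustrative example only: it assumes the spatial data have already been clipped to each source area, so that each map unit is represented by an (area, value) pair. The function name and all numeric values are hypothetical. The sign convention follows the text above, where a negative difference indicates a larger value within the contributing recharge area.

```python
def area_weighted_mean(pieces):
    """Area-weighted mean of a variable over a source area.

    pieces: list of (area, value) pairs, one per map unit clipped
    to the source area (circular buffer or contributing recharge area).
    """
    total_area = sum(area for area, _ in pieces)
    if total_area <= 0:
        raise ValueError("source area must have a positive total area")
    return sum(area * value for area, value in pieces) / total_area

# Hypothetical clipped map units: (area, silt content in percent).
buffer_pieces = [(60.0, 20.0), (40.0, 35.0)]
cra_pieces = [(25.0, 20.0), (75.0, 35.0)]

buffer_mean = area_weighted_mean(buffer_pieces)
cra_mean = area_weighted_mean(cra_pieces)

# Buffer minus contributing recharge area; negative means the
# contributing recharge area has the larger value.
difference = buffer_mean - cra_mean
```

Both source areas here have the same total area, as in the volume-equivalent case, yet the means differ because the elongated area samples a different mix of map units.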
Differences in estimates of area-weighted spatial variables
result in corresponding differences in predictions of
nitrate occurrence in the aquifer. In addition to the factors
affecting contributing recharge areas and estimated explanatory
variables, differences in predictions also are a function
of the specific set of explanatory variables used and the fitted
slope coefficients in a given model. For models that predicted
the probability of exceeding 1 and 4 milligrams per liter as
nitrogen (mg/L as N), predicted probabilities using variables
estimated from circular buffers and contributing recharge
areas generally were correlated but differed significantly at the
local and basin scale. The scale and distribution of prediction
differences can be explained by the underlying differences in
the estimated variables and the relative weight of the variables
in the statistical models. Differences in predictions from the model
for exceeding 1 mg/L as N, which includes only environmental variables,
generally correlated with the underlying differences in
STATSGO data, whereas differences in predictions of exceeding 4 mg/L as
N were more spatially extensive because that model included
environmental and nitrogen-source variables. Substituting depths
to water within the circular buffers calculated from the
groundwater-flow model, restricted to the same range, for depths
to water derived from the spatial data
resulted in large differences in predicted probabilities.
The differences in estimated explanatory variables between
contributing recharge areas and circular buffers indicate that
incorporating physically based contributing recharge areas likely
would result in a different set of explanatory variables and an
improved set of statistical models.
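The way differences in explanatory variables propagate through fitted slope coefficients into predicted probabilities can be illustrated with a short sketch. The abstract does not state the functional form of the statistical models; the example below assumes a logistic-regression form, and the intercept, slope coefficients, variable names, and values are all hypothetical.

```python
import math

def exceedance_probability(intercept, slopes, variables):
    """Probability of exceeding a threshold, assuming a logistic form.

    slopes and variables are dicts keyed by explanatory-variable name.
    """
    linear = intercept + sum(slopes[name] * variables[name] for name in slopes)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients and area-weighted variable values.
intercept = -1.0
slopes = {"silt_content": -0.02, "depth_to_water": 0.05}
buffer_vars = {"silt_content": 30.0, "depth_to_water": 10.0}
cra_vars = {"silt_content": 45.0, "depth_to_water": 18.0}

p_buffer = exceedance_probability(intercept, slopes, buffer_vars)
p_cra = exceedance_probability(intercept, slopes, cra_vars)
prediction_difference = p_buffer - p_cra
```

Under this assumed form, the size of the prediction difference depends on both the difference in each variable and the magnitude of its slope coefficient, which is consistent with the explanation given above.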
The use of a groundwater-flow model to improve
representations of source areas or to provide more-detailed
estimates of specific explanatory variables includes a number
of limitations and technical considerations. Two assumptions
in these analyses are that (1) there is a state of mass balance
between recharge and pumping, and (2) transport to a pumped
well is under a steady-state flow field. Comparison of volume-equivalent
contributing recharge areas under steady-state and
transient transport conditions at a location in the southeastern
part of the basin shows the steady-state contributing recharge
area is a reasonable approximation of the transient contributing
recharge area after between 10 and 20 years of pumping.
The first assumption is the more important consideration for
this analysis. A gradient effect refers to a condition where
simulated pumping from a well is less than recharge through
the corresponding contributing recharge area. This generally
takes place in areas with steep hydraulic gradients, such as
near discharge locations, and can be mitigated using a finer
model discretization. A boundary effect refers to a condition
where recharge through the contributing recharge area is less
than pumping. This indicates other sources of water to the
simulated well and could reflect a real hydrologic process.
In the Great Miami River Basin, large gradient and boundary
effects—defined as the balance between pumping and
recharge being less than half—occurred in 5 and 14 percent
of the basin, respectively. The agreement between circular
buffers and volume-equivalent contributing recharge areas,
differences in estimated variables, and the effect on statistical-model
predictions were similar for the population of wells with a
balance between pumping and recharge within 10 percent
and for the population of all wells. This indicated that
process-model limitations did not affect the overall findings in
the Great Miami River Basin; however, this would be model
specific, and prudent use of a process model needs to entail a
limitations analysis and, if necessary, alterations to the model.
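A limitations analysis of the kind described above could begin by classifying each simulated well by its balance between pumping and recharge. The sketch below is one possible reading, not the study's procedure: it assumes the balance is expressed as the ratio of the smaller quantity to the larger, and that a "large" effect means a ratio of less than half. The function name, threshold parameter, and input values are hypothetical.

```python
def classify_balance(pumping, recharge_through_cra, large_threshold=0.5):
    """Classify the mass balance for one simulated well.

    Returns (effect, ratio, is_large), where ratio is the smaller of the
    two rates divided by the larger (an assumed definition of balance).
    """
    if pumping <= 0 or recharge_through_cra <= 0:
        raise ValueError("rates must be positive")
    if pumping < recharge_through_cra:
        # Pumping is less than recharge through the contributing area.
        effect = "gradient"
        ratio = pumping / recharge_through_cra
    elif recharge_through_cra < pumping:
        # Recharge is less than pumping: other sources supply the well.
        effect = "boundary"
        ratio = recharge_through_cra / pumping
    else:
        effect = "balanced"
        ratio = 1.0
    return effect, ratio, ratio < large_threshold

# Hypothetical rates in consistent volumetric units.
gradient_case = classify_balance(40.0, 100.0)
boundary_case = classify_balance(100.0, 80.0)
```

Tallying the share of wells flagged as large gradient or boundary effects would give basin-wide percentages comparable in kind to those reported above.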