U.S. Geological Survey Scientific Investigations Report 2009-5268
Trends in Water Quality in the Southeastern United States, 1973–2005
Trend Associations with Ancillary Data
A comparison of variation in annual median constituent concentrations with annual variation in some landscape alterations that could affect streamwater quality suggests possible causes of trends in streamwater quality. The basin landscape variables, which include annual farm and nonfarm fertilizer sales estimates, atmospheric nitrogen deposition, runoff, and a variety of annual crop- and animal-production variables, were expressed per unit area for each basin. However, no data were available for several important basin variables that may affect water quality, including changes in basin urbanization, municipal and industrial wastewater inputs, and basin land-management practices.
Multiple regression analysis was used to relate each dependent water-quality variable to multiple independent ancillary variables and to assess which of the ancillary variables would be useful in prediction models. Annual median values of water-quality constituents and physical properties were regressed against annual values of the basin ancillary landscape and agricultural variables for the 44 NWIS sites. The multiple regression analysis was limited to the ancillary variables with the greatest number of values, including year; nitrogen fertilizer application from fertilizer sales data; annual runoff; corn, soybean, tobacco, and wheat harvest; and population density of beef cattle and hogs. Discrete dummy variables were coded for each site to incorporate the spatial variation of site location in the regression models.
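The dummy-variable coding described above can be illustrated with a minimal Python sketch. The report does not state the software or the exact coding scheme used, so everything here is an assumption for illustration: the two sites, the two predictors (runoff and fertilizer), the synthetic values, and the helper names `solve_linear`, `ols`, and `design_row` are all hypothetical. One site serves as the reference level, and each remaining site gets its own 0/1 column.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(X, y):
    """Ordinary least squares through the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def design_row(site, predictors, sites):
    """Intercept, one 0/1 dummy per non-reference site, then the predictors."""
    dummies = [1.0 if site == s else 0.0 for s in sites[1:]]
    return [1.0] + dummies + list(predictors)


# Hypothetical annual records: (site, runoff, fertilizer per unit area).
sites = ["A", "B"]  # site "A" is the reference level
records = [
    ("A", 10.0, 1.0), ("A", 12.0, 2.0), ("A", 15.0, 1.5),
    ("B", 8.0, 3.0), ("B", 11.0, 2.5), ("B", 14.0, 4.0),
]

# Synthetic, noise-free response built from known coefficients, so the
# fit can be checked: intercept 2, site-B offset 3, runoff 0.5, fertilizer 1.5.
y = [2.0 + (3.0 if s == "B" else 0.0) + 0.5 * q + 1.5 * f for s, q, f in records]

X = [design_row(s, (q, f), sites) for s, q, f in records]
coefficients = ols(X, y)  # order: intercept, site B, runoff, fertilizer
```

In this sketch the site dummy absorbs a constant offset between sites, so the slopes for runoff and fertilizer are estimated from within-site variation. With 44 sites, the same coding would produce 43 dummy columns.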
Table 2 (Excel file) gives the regression results for the NWIS data. For each water-quality property or constituent, it lists the ancillary variables included in the model with the highest coefficient of determination (R²) and lowest Mallows’ Cp value (Snedecor and Cochran, 1980). Coefficient of determination values for these models, which give the fraction of the variance explained by the regression, range from 0.41 to 0.98. The table is shown to illustrate patterns that recurred across many iterations of possible multiple-regression models for the NWIS data.
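The two selection criteria named above can be sketched in Python as an all-subsets comparison. This is not the report's procedure, which is not described in detail. It assumes the common definition Cp = SSE_p / s² − n + 2p, where SSE_p is the residual sum of squares of a candidate model with p parameters (including the intercept), s² is the residual mean square of the full model, and n is the number of observations. The candidate variables `x1` to `x3`, the response values, and the helper names are hypothetical.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sse_of_fit(columns, y):
    """Fit y on an intercept plus the given columns; return the residual sum of squares."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


# Hypothetical candidate predictors and response (synthetic values).
candidates = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0],
    "x3": [5.0, 3.0, 8.0, 1.0, 7.0, 2.0, 6.0, 4.0],
}
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]

n = len(y)
mean_y = sum(y) / n
sst = sum((v - mean_y) ** 2 for v in y)  # total sum of squares

names = list(candidates)
p_full = len(names) + 1  # parameters in the full model, including the intercept
sigma2 = sse_of_fit([candidates[v] for v in names], y) / (n - p_full)

# Map each subset of predictors to its (R squared, Mallows' Cp) pair.
results = {}
for size in range(1, len(names) + 1):
    for subset in combinations(names, size):
        sse = sse_of_fit([candidates[v] for v in subset], y)
        p = size + 1
        results[subset] = (1.0 - sse / sst, sse / sigma2 - n + 2 * p)

best_by_cp = min(results, key=lambda s: results[s][1])
```

R² never decreases when a variable is added to a nested model, so it tends to favor the largest model. Cp adds a penalty of 2 per parameter, and a model with little bias has Cp close to p, which is why the two criteria are often read together.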
A few distinct patterns are evident in the models selected in the multiple-regression analysis. The variables for nitrogen fertilizer, atmospheric nitrogen deposition, corn harvest, and tobacco harvest were each selected in 10 of 22 constituent regression models. Beef cattle population density was selected for 18 of the 22 models developed. Runoff, corn harvest, and beef cattle were selected for the specific conductance, alkalinity, dissolved oxygen, calcium, sodium, and total phosphorus models. Tobacco was selected as an independent variable for inclusion in all nutrient models except total phosphorus, and the beef cattle variable was selected for inclusion in all of the nutrient models except total nitrogen.