
USGS - science for a changing world

Scientific Investigations Report 2010–5008

Use of Continuous Monitors and Autosamplers to Predict Unmeasured Water-Quality Constituents in Tributaries of the Tualatin River, Oregon

Table 15. Preliminary model statistics for correlation of Escherichia coli bacteria with continuous parameters at Dairy Creek at Highway 8 near Hillsboro, Oregon, 2002–04.

[Regression models are of the form E. coli = a*Turb + b*Q + c*SC + d, where a, b, and c are model coefficients and d is the intercept; Turb, Q, and SC are the explanatory variables turbidity (in Formazin Nephelometric Units), discharge (in cubic feet per second), and specific conductance (in microsiemens per centimeter), respectively; and E. coli is the dependent variable, in colonies per 100 milliliters. Where E. coli is log transformed, a bias correction factor (BCF; Duan, 1983) is multiplied by 10^(log E. coli) to get the final value. Model 7b was not included in the model selection scheme because stage and discharge are surrogates for one another, but model 7b was built separately to evaluate the use of stage as an independent variable to compensate for backwater effects on discharge at the Dairy Creek site. RMSE values are in colonies per 100 milliliters. The maximum Variance Inflation Factor (VIF) indicates the largest VIF obtained for any one variable in the correlation. Abbreviations: E. coli, Escherichia coli bacteria; n, number of samples; Adj.-R², adjusted R², a coefficient of determination that adjusts for degrees of freedom and penalizes the use of too many explanatory variables; f, a function of indicated constituents; log, base 10 logarithm; RMSE, root mean square error; NA, not applicable (VIFs are not applicable because only one independent variable was used)]
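The back-transformation described in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `predict_ecoli` and the turbidity input of 10 FNU are hypothetical, and the coefficients used in the example are those listed for model 4 (a = 0.049, d = 1.69, BCF = 1.1).

```python
def predict_ecoli(turb=0.0, q=0.0, sc=0.0, *, a=0.0, b=0.0, c=0.0, d=0.0, bcf=1.0):
    """Estimate E. coli (colonies per 100 mL) from a log10-transformed model.

    Hypothetical helper: coefficients for unused explanatory variables
    are left at zero so the corresponding terms drop out.
    """
    # Linear model in log10 space: log E. coli = a*Turb + b*Q + c*SC + d
    log_ecoli = a * turb + b * q + c * sc + d
    # Back-transform and apply the bias correction factor (BCF)
    return bcf * 10 ** log_ecoli


# Model 4, log E. coli = f(Turb), with an assumed turbidity of 10 FNU
estimate = predict_ecoli(turb=10.0, a=0.049, d=1.69, bcf=1.1)
print(f"Estimated E. coli: {estimate:.0f} colonies per 100 mL")
```

Leaving unused coefficients at zero lets one function cover every model form in the table; the BCF defaults to 1.0, which applies no correction.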

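Column groups: model calibration (value of coefficients a, b, c, d, and BCF, when used; correlation statistics n, Adj.-R², and maximum VIF) and model validation, goodness-of-fit evaluation (mean error, validation RMSE, coefficient of determination, Nash-Sutcliffe coefficient, and z-statistic from sign test).

Scenario 1. Calibration data set: autosamplers only. Validation data set: Clean Water Services ambient monitoring data.

| Model No. and form | a | b | c | d | BCF | n | Adj.-R² | Maximum VIF | Mean error | Validation RMSE | Coefficient of determination | Nash-Sutcliffe coefficient | z-statistic from sign test |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. log E. coli = f(Turb, SC) | 0.035 | | -0.025 | 5.14 | 1.05 | 37 | 0.871 | 1.3 | 185 | 694 | 0.10 | -4.1 | 0.31 |
| 2. log E. coli = f(Turb, Q, SC) | 0.036 | -0.0001 | -0.025 | 5.15 | 1.05 | 37 | 0.867 | 2.9 | 118 | 735 | 0.09 | -3.3 | 0.63 |
| 3. log E. coli = f(Q, SC) | | 0.005 | -0.029 | 5.71 | 1.12 | 38 | 0.715 | 1.2 | 11 | 653 | 0.10 | -2.4 | 1.1 |
| 4. log E. coli = f(Turb) | 0.049 | | | 1.69 | 1.1 | 37 | 0.695 | NA | -122 | 331 | 0.01 | -0.15 | 0.93 |

Scenario 2. Calibration data set: peak autosampler plus first monthly and high flow monitoring samples from Clean Water Services dataset. Validation data set: remaining monthly Clean Water Services ambient monitoring + non-peak autosampler data.

| Model No. and form | a | b | c | d | BCF | n | Adj.-R² | Maximum VIF | Mean error | Validation RMSE | Coefficient of determination | Nash-Sutcliffe coefficient | z-statistic from sign test |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5. log E. coli = f(SC) | | | 0.013 | 0.713 | 1.29 | 17 | 0.342 | NA | 94 | 567 | 0.02 | -3.2 | 1.2 |
| 6. log E. coli = f(Turb, SC) | 0.017 | | 0.013 | 0.497 | 1.45 | 17 | 0.309 | 1.0 | 161 | 667 | 0.03 | -4.8 | 2.4 |
| 7a. log E. coli = f(Q, SC) | | -0.0004 | 0.011 | 0.981 | 1.37 | 16 | 0.378 | 1.2 | 19 | 329 | 0.02 | 0.13 | 1.8 |
| 7b. log E. coli = f(Stage, SC) | | -0.0001 | 0.011 | 0.828 | 1.36 | 32 | 0.391 | 11.4 | -2 | 331 | 0.02 | 0.21 | 0.9 |
| 8. log E. coli = f(Turb, Q, SC) | 0.017 | 0.00001 | 0.013 | 0.494 | 1.45 | 17 | 0.256 | 11.2 | 71 | 338 | 0.07 | -0.26 | 2 |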
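
The validation columns report mean error, RMSE, and the Nash-Sutcliffe coefficient. The sketch below shows how these statistics are commonly computed. It rests on assumptions: the report's exact conventions are not stated on this page, mean error is taken here as predicted minus observed, and the function names and sample values are hypothetical. A Nash-Sutcliffe coefficient of 1 indicates a perfect fit, 0 indicates the model does no better than the mean of the observations, and negative values (as for several models above) indicate it does worse.

```python
import math


def mean_error(observed, predicted):
    # Assumed sign convention: predicted minus observed
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    # Root mean square error, in the units of the observations
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def nash_sutcliffe(observed, predicted):
    # 1 minus (sum of squared errors / sum of squared deviations from the observed mean)
    obs_mean = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst


# Hypothetical values, for illustration only (colonies per 100 mL)
obs = [120.0, 340.0, 95.0, 610.0]
pred = [150.0, 300.0, 130.0, 480.0]
print(mean_error(obs, pred), rmse(obs, pred), nash_sutcliffe(obs, pred))
```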

1 Exceeds a threshold VIF value, calculated as 1/(1 − Adj.-R²), and indicates possible multicollinearity.
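
As a quick check of the threshold in the footnote, the following sketch computes 1/(1 − adjusted R²) and compares a maximum VIF against it. The function names are hypothetical; the example uses the values listed for model 2 (adjusted R² of 0.867, maximum VIF of 2.9).

```python
def vif_threshold(adj_r2):
    """Threshold VIF from the footnote: 1 / (1 - adjusted R-squared)."""
    if adj_r2 >= 1.0:
        raise ValueError("adjusted R-squared must be less than 1")
    return 1.0 / (1.0 - adj_r2)


def exceeds_threshold(max_vif, adj_r2):
    # True when the largest VIF suggests possible multicollinearity
    return max_vif > vif_threshold(adj_r2)


# Model 2: adjusted R-squared = 0.867, maximum VIF = 2.9
print(round(vif_threshold(0.867), 2))   # about 7.52
print(exceeds_threshold(2.9, 0.867))    # False: 2.9 is below the threshold
```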

First posted June 18, 2010

For additional information contact:
Director, Oregon Water Science Center
U.S. Geological Survey
2130 SW 5th Ave.
Portland, Oregon 97201
http://or.water.usgs.gov


U.S. Department of the Interior | U.S. Geological Survey
URL: http://pubsdata.usgs.gov/pubs/sir/2010/5008/table15.html
Page Last Modified: Thursday, 10-Jan-2013 19:12:08 EST