USGS Scientific Investigations Report 2012–5200 - Suspended-Sediment Characteristics of the Johnson Creek Basin, Oregon, Water Years 2007–10

Appendix C. Regression Model Evaluation

Seven linear regression models were evaluated for the Gresham and Milwaukie stations (table 2). Linear regression analysis requires normality of the resulting error distribution (Ott and Longnecker, 2001). Initially, the raw data sets used in the regression models were heavily skewed (a violation of normality). To minimize skew and to produce an error distribution approaching normality, the values of SSC, streamflow, and turbidity were transformed to base-10 logarithmic values. Other models were run with square root transformation or no transformation for comparative purposes.  
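The effect of the base-10 logarithmic transformation on skew can be illustrated with a minimal sketch. The SSC values below are hypothetical and are not data from this report; the `skewness` helper is an illustrative name that computes the moment-based skewness coefficient.

```python
import math

def skewness(values):
    """Moment-based skewness coefficient: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical SSC values, in mg/L (illustrative only, not report data)
ssc = [4.0, 6.0, 9.0, 15.0, 22.0, 40.0, 85.0, 160.0, 420.0, 900.0]
log_ssc = [math.log10(v) for v in ssc]

raw_skew = skewness(ssc)
log_skew = skewness(log_ssc)
print(f"skew of raw SSC:   {raw_skew:.2f}")
print(f"skew of log10 SSC: {log_skew:.2f}")
```

For right-skewed data such as these, the skewness of the log-transformed values is much closer to zero than that of the raw values.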

When appropriate, it is beneficial to use the same transformation and parameters for models at multiple stations, which tends to result in more consistent SSL computations between stations. The same basic model structure therefore should be used at both stations unless diagnostic results warrant otherwise.

Models were evaluated on the basis of two criteria: 

Diagnostic linear regression statistics, including:

Adjusted coefficient of determination (Adj R²) 
Root-mean-squared error (RMSE) 
Mean absolute error (MAE) 
Mean absolute percentage error (MAPE)

Evaluation of linear regression model residuals, including:

The Jarque-Bera test for normality (JB)
The Breusch-Pagan test for heteroscedasticity (BP)

Linear regression diagnostic comparisons are meaningful only if the dependent variable is the same for all models. This condition is not met when the dependent variable has been log- or square-root transformed in some models but not in others. Therefore, when required, diagnostic values were transformed back into linear space before being evaluated.
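As a rough sketch of why the back-transformation matters, the example below computes RMSE for the same hypothetical predictions in log space and in linear space. All values and names are illustrative assumptions. The sketch also omits any retransformation bias correction, which may be needed in practice.

```python
import math

def rmse(observed, predicted):
    """Root-mean-squared error, dividing by the number of observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Hypothetical observed SSC (mg/L) and predictions from a log10 model
observed_ssc = [10.0, 50.0, 200.0]
pred_log10 = [1.1, 1.6, 2.4]

# Back-transform predictions to linear space (no bias correction applied)
pred_linear = [10 ** p for p in pred_log10]

rmse_log_space = rmse([math.log10(o) for o in observed_ssc], pred_log10)
rmse_linear_space = rmse(observed_ssc, pred_linear)
print(f"RMSE in log10 units: {rmse_log_space:.3f}")
print(f"RMSE in mg/L:        {rmse_linear_space:.1f}")
```

The two RMSE values are in different units and cannot be compared with each other; only the linear-space value is comparable with the RMSE of an untransformed model.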

Diagnostic Linear Regression Statistics 

The coefficient of determination (R²) estimates the proportion of variability explained by the regression model. Similarly, the Adj R² estimates the proportion of variability explained by the regression model while accounting for the number of explanatory variables. RMSE quantifies the typical difference between values predicted by the model and the observed values, giving greater weight to large errors. MAE is the average absolute difference between predicted and observed values. MAPE expresses the average absolute error as a percentage of the observed values. As regression models approach Adj R² values of 1.0, the models approach perfect correlation. Similarly, as regression models approach RMSE, MAE, and MAPE values of zero, the models approach perfect estimation (residual values of zero).
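The four statistics can be computed as in the following minimal sketch. The function name and the sample values are illustrative assumptions, and RMSE is assumed here to divide by the number of observations n (some software divides by the residual degrees of freedom instead).

```python
import math

def regression_diagnostics(observed, predicted, n_predictors):
    """Return Adj R2, RMSE, MAE, and MAPE (in percent) for one model."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    r2 = 1.0 - ss_res / ss_tot
    # Penalize R2 for the number of explanatory variables
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    mape = 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n
    return {"adj_r2": adj_r2, "rmse": rmse, "mae": mae, "mape": mape}

# Hypothetical observed and predicted values (illustrative only)
stats = regression_diagnostics([10.0, 20.0, 30.0, 40.0],
                               [12.0, 18.0, 33.0, 37.0],
                               n_predictors=1)
print(stats)
```

MAPE is undefined when any observed value is zero, so this sketch assumes strictly positive observations.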

For the Gresham station, models using streamflow and turbidity as independent variables had higher Adj R² values and lower RMSE, MAE, and MAPE values than models using a single independent variable (table 2). For the Milwaukie station, models using only turbidity as an independent variable performed as well as or better than models using both streamflow and turbidity as independent variables. However, the improvement in regression diagnostics gained by using a model with only turbidity as an independent variable at Milwaukie was much smaller than the overall advantage gained by using both independent variables at Gresham. Consequently, if the same basic model structure were to be maintained at both stations, the model using both independent variables would provide better overall results. 

Evaluation of Linear Regression Model Residuals 

One of the assumptions of linear regression is that the residual errors are normally distributed. Violations of this assumption compromise the estimation of coefficients and the calculation of prediction intervals. The Jarque-Bera (JB) test for normality (Jarque and Bera, 1980) was used on the residuals of each model. The JB test is a goodness-of-fit test that examines the skewness and kurtosis of a distribution and compares them with those of a normal distribution. The JB test statistic has a chi-squared distribution with two degrees of freedom. The P-values associated with the computed JB test statistic are shown in table 2. Using a significance level of 0.05, P-values less than 0.05 suggest a statistically significant departure from normality in the distribution of residuals for the model. For the Gresham and Milwaukie stations, all models failed to reject the null hypothesis of normally distributed residuals.
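A minimal sketch of the JB computation follows, assuming the usual form of the statistic, JB = (n/6)(S² + (K − 3)²/4), where S is the sample skewness and K is the sample kurtosis. Because the reference distribution is chi-squared with two degrees of freedom, the P-value reduces to exp(−JB/2). The residual values are hypothetical, and the chi-squared approximation can be poor for small samples.

```python
import math

def jarque_bera(residuals):
    """Return the JB statistic and its chi-squared (2 df) P-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of chi-squared with 2 degrees of freedom
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Hypothetical residuals: one symmetric set and one strongly skewed set
jb_sym, p_sym = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
jb_skew, p_skew = jarque_bera([-1.0] * 19 + [19.0])
print(f"symmetric: JB = {jb_sym:.3f}, P = {p_sym:.3f}")
print(f"skewed:    JB = {jb_skew:.1f}, P = {p_skew:.3g}")
```

At a significance level of 0.05, the symmetric set fails to reject normality, whereas the skewed set rejects it.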

Linear regression models assume homoscedasticity (constant variance) of the resulting error distribution. Violations of the homoscedasticity assumption can result in inaccurate forecast error and prediction intervals. Violations also can result in too much weight given to a small subset of the data, such as the group of measurements with the largest SSC values. The Breusch-Pagan (BP) test can be used to test for heteroscedasticity in a linear regression model (Breusch and Pagan, 1979). The BP test examines the residuals of a model by regressing the squared residuals on the independent variables. The BP test statistic has a chi-squared distribution with k degrees of freedom, where k is the number of independent variables. The P-values associated with the computed BP test statistic are shown in table 2. P-values closer to 0 suggest a stronger departure from homoscedasticity in the distribution of residuals for the model.
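The auxiliary regression behind the BP test can be sketched for the single-variable case (k = 1). This sketch uses the studentized form of the statistic, n times the R² of the auxiliary regression, which is an assumption and may differ from the variant used for table 2. The data and function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def breusch_pagan_single(x, y):
    """Studentized BP statistic and P-value for one independent variable."""
    n = len(x)
    intercept, slope = fit_line(x, y)
    squared_resid = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]

    # Auxiliary regression: squared residuals on the independent variable
    aux_intercept, aux_slope = fit_line(x, squared_resid)
    mean_sq = sum(squared_resid) / n
    ss_tot = sum((s - mean_sq) ** 2 for s in squared_resid)
    ss_res = sum((s - (aux_intercept + aux_slope * xi)) ** 2
                 for xi, s in zip(x, squared_resid))
    aux_r2 = max(0.0, 1.0 - ss_res / ss_tot)  # guard against rounding below zero

    lm = n * aux_r2
    # Survival function of chi-squared with 1 degree of freedom
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value

# Hypothetical data whose scatter grows with x (illustrative only)
x = [float(i) for i in range(1, 21)]
y = [xi * (1.3 if i % 2 == 0 else 0.7) for i, xi in enumerate(x, start=1)]
lm, p_value = breusch_pagan_single(x, y)
print(f"BP statistic = {lm:.2f}, P-value = {p_value:.4f}")
```

Because the scatter in this example grows with x, the P-value is small and the null hypothesis of homoscedasticity is rejected. With more than one independent variable, the auxiliary regression and the chi-squared degrees of freedom would need to be generalized.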

At the Gresham station, models 1, 2, and 5 failed to reject the null hypothesis of homoscedasticity at a significance level of 0.05. For all other models, the null hypothesis is rejected, and the model residual distribution is considered heteroscedastic. At the Milwaukie station, models 2, 3, and 5 failed to reject the null hypothesis of homoscedasticity at a significance level of 0.05. For all other models, the null hypothesis is rejected, and the model residual distribution is considered heteroscedastic. 

Selection of Model 

Models 2 and 5 were eliminated from consideration because of their relatively poor linear regression diagnostic statistics. No models were eliminated on the basis of the JB test. The BP test rejected the null hypothesis of homoscedasticity for most models. Models with turbidity as an independent variable appear to be less homoscedastic in their error distributions (table 2). However, models with turbidity as an independent variable tend to provide more accurate estimates (that is, lower RMSE, MAE, and MAPE) than models not employing turbidity as an independent variable. The extra accuracy gained by including turbidity as a regression variable far outweighs any diminished accuracy in forecasts and prediction intervals resulting from heteroscedasticity. Each diagnostic linear regression statistic was ranked among models for each station (for example, model 7 provided the lowest MAE value at the Gresham gaging station and was ranked first), and the average rank for each model was computed. Model 6 was selected because it had the lowest average rank.
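The rank-averaging procedure can be sketched as follows. The diagnostic values and model names are hypothetical and do not correspond to table 2. For brevity, the sketch covers a single station and assumes no tied values.

```python
def rank_models(values, higher_is_better=False):
    """Rank models 1..n on one statistic (1 = best); ties are not handled."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {model: rank for rank, model in enumerate(ordered, start=1)}

# Hypothetical diagnostics for three candidate models at one station
diagnostics = {
    "adj_r2": {"model_a": 0.90, "model_b": 0.95, "model_c": 0.93},
    "rmse": {"model_a": 30.0, "model_b": 22.0, "model_c": 25.0},
    "mae": {"model_a": 12.0, "model_b": 11.0, "model_c": 10.0},
}
higher_is_better = {"adj_r2"}  # all other statistics are errors (lower is better)

ranks = {stat: rank_models(vals, stat in higher_is_better)
         for stat, vals in diagnostics.items()}
models = list(diagnostics["adj_r2"])
average_rank = {m: sum(r[m] for r in ranks.values()) / len(ranks) for m in models}
selected = min(average_rank, key=average_rank.get)
print(average_rank)
print("selected:", selected)
```

Extending the sketch to two stations would amount to adding each station's ranks to the average before selecting the model with the lowest value.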

References Cited 

Breusch, T.S., and Pagan, A.R., 1979, Simple test for heteroscedasticity and random coefficient variation: Econometrica (The Econometric Society), v. 47, no. 5, p. 1,287–1,294. 

Jarque, C.M., and Bera, A.K., 1980, Efficient tests for normality, homoscedasticity, and serial independence of regression residuals: Economics Letters, v. 6, no. 3, p. 255–259. 

Ott, R.L., and Longnecker, M., 2001, An introduction to statistical methods and data analysis: Pacific Grove, Calif., Wadsworth Group, 1,152 p.
