Sources of Sodium and Chloride in the Scituate Reservoir Drainage Basin, Rhode Island

Water-Resources Investigations Report 02-4149

Regression Analysis

Regression analysis is a statistical technique that provides an equation describing the nature of the relation between two measured variables. In simple regression, the equation predicts the value of one of the variables (the dependent variable, or “y”) on the basis of the value of one other variable (the independent variable, or “x”). In this study, the dependent variables were the median concentrations of sodium and chloride in streams that supply water to the Scituate Reservoir. The independent variables were the densities of State-maintained and locally maintained roads in the subbasins supplying the streams. The simplest relation between the two variables was assumed: a straight line, in which the value of the dependent variable changes in linear proportion to a change in the value of the independent variable. The resulting equation describes the “best-fitting” straight line, in the sense that the distances between the line and each of the data points are minimized.
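As an illustration of how a best-fitting straight line can be obtained, the following minimal sketch fits a line by ordinary least squares, which minimizes the sum of the squared vertical distances between the line and the data points. It assumes that this is the sense of "best-fitting" intended here; the report does not name its fitting procedure. The function name fit_line and the numbers in road_density and median_sodium are hypothetical and are not data from this study.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products of deviations from the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sum of squared deviations of the independent variable
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical illustration values (NOT measurements from the report)
road_density = [0.0, 1.0, 2.0, 3.0, 4.0]      # independent variable, x
median_sodium = [5.0, 9.0, 10.0, 15.0, 16.0]  # dependent variable, y

slope, intercept = fit_line(road_density, median_sodium)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```

The slope gives the estimated change in the dependent variable per unit change in the independent variable, and the intercept gives the estimated value of the dependent variable when the independent variable is zero.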

Simple regression analysis can also be used to measure the accuracy with which the regression equation predicts values of the dependent variable on the basis of the value of the independent variable. This measure, known as R2, is the proportion of the variation in the dependent variable that can be accounted for by variations in the independent variable. For example, an R2 of 0.62 for the equation relating median stream-sodium concentration to the density of State-maintained roads in the subbasins supplying the streams (fig. 4) indicates that 62 percent of the variation in sodium concentration is accounted for by the variation in road density. A second measure of the reliability of the regression equation is obtained by asking whether or not the observed relation could have appeared by chance alone. The p-values presented in figures 4-6 give the probabilities that a linear relation between the two variables could have arisen by chance. The probabilities are small (less than 0.01 percent) that the strong positive relations between stream sodium and chloride concentrations and the densities of State-maintained roads are simply due to chance arrangements of the data. In contrast, the probabilities that the observed relations involving the two constituents and the densities of locally maintained roads are due to chance are about 81 percent.
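The two measures described above can be sketched in a few lines. The example below computes R2 for a simple linear regression and then estimates a p-value with a permutation test, which shuffles the dependent values many times and counts how often a chance arrangement gives an R2 at least as large as the observed one. The permutation test is a stand-in chosen because it mirrors the "chance arrangements of the data" wording; it is not necessarily the method used to obtain the p-values in figures 4-6. The function names, the seed, the number of shuffles, and the data values are all hypothetical.

```python
import random
from statistics import mean


def r_squared(x, y):
    """Proportion of the variation in y accounted for by a straight-line relation with x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


def permutation_p_value(x, y, n_shuffles=10000, seed=0):
    """Estimate the probability that chance alone gives an R2 at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = r_squared(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real pairing between x and y
        if r_squared(x, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical illustration values (NOT measurements from the report)
road_density = [0.0, 1.0, 2.0, 3.0, 4.0]
median_sodium = [5.0, 9.0, 10.0, 15.0, 16.0]

r2 = r_squared(road_density, median_sodium)
p = permutation_p_value(road_density, median_sodium)
print(f"R2 = {r2:.2f}, estimated p-value = {p:.4f}")
```

A small estimated p-value means that few random pairings of the data reproduce a relation as strong as the observed one; a large value, such as the roughly 81 percent reported for locally maintained roads, means that chance pairings routinely do.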

