Effects of Urbanization on Stream Ecosystems in the Willamette River Basin and Surrounding Area, Oregon and Washington

Scientific Investigations Report 2006–5101–D

Data Reduction and Analysis

Pesticide Toxicity Index Calculations

To supplement the spatial comparison of individual pesticides among sites, a pesticide toxicity index (PTI) was calculated for each stream water sample (Munn and Gilliom, 2001). An additive model was used with the PTI to estimate potential toxicity of a water sample containing more than one pesticide. The toxicity was estimated by comparing stream concentrations to laboratory bioassay test endpoints such as the Lethal and Effect Concentrations for 50 percent of a test population (LC₅₀ and EC₅₀, respectively) for three taxonomic groups of aquatic organisms (fish, benthic invertebrates, and cladocerans [water flea]) (Munn and others, 2006). Although the PTI does not determine the actual toxicity of a sample, it can be used to estimate and rank the relative toxicity of samples containing one or more pesticides. The PTI value was computed for each sample by summing the toxicity quotients (the measured concentration of a pesticide in a stream sample divided by its median toxicity concentration from bioassay tests) for all pesticides detected in a sample. Some pesticide compounds had no toxicological data for some or all of the three taxonomic groups. To maximize the number of pesticides included in the PTI, a single overall PTI was calculated using the most sensitive or lowest median toxicity concentrations for fish, benthic macroinvertebrates, and cladocerans. Limitations of the PTI include:

Despite its limitations, the PTI proved to be a useful measure for assessing the potential cumulative effects of pesticides on aquatic ecosystems, and for examining the relative toxicity of pesticides that do not currently have aquatic-life benchmarks.
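The additive PTI described above can be sketched in a few lines. This is a minimal illustration, not the code used for the report: the function name, the data layout, and all concentrations and toxicity values are hypothetical, and it assumes that measured concentrations and median toxicity concentrations share the same units.

```python
from typing import Mapping, Optional


def pesticide_toxicity_index(
    concentrations: Mapping[str, float],
    toxicity: Mapping[str, Mapping[str, Optional[float]]],
) -> float:
    """Sum toxicity quotients over the pesticides detected in one sample.

    concentrations: measured concentration of each pesticide in the sample.
    toxicity: median toxicity concentration (LC50 or EC50) per pesticide and
        taxonomic group; None marks a group with no toxicological data.
    """
    pti = 0.0
    for pesticide, conc in concentrations.items():
        endpoints = [v for v in toxicity.get(pesticide, {}).values() if v is not None]
        if conc <= 0 or not endpoints:
            continue  # not detected, or no toxicity data for any group
        # Single overall PTI: divide by the lowest (most sensitive) endpoint.
        pti += conc / min(endpoints)
    return pti


# Hypothetical values for illustration only (not taken from the report).
sample = {"pesticide_a": 0.02, "pesticide_b": 0.5, "pesticide_c": 0.1}
tox = {
    "pesticide_a": {"fish": 4.0, "benthic": 0.5, "cladoceran": 0.1},
    "pesticide_b": {"fish": 100.0, "benthic": None, "cladoceran": 50.0},
    "pesticide_c": {},  # no toxicological data, so it is left out of the sum
}
pti = pesticide_toxicity_index(sample, tox)
```

In this example the index is 0.02/0.1 + 0.5/50; the third compound contributes nothing because it has no toxicity data, which is one way a PTI can understate potential toxicity.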

Explanatory Environmental Variables

Five distinct environmental data sets were compiled: hydrology, water temperature, stream habitat, water chemistry, and watershed characteristics. These data sets were used to explain differences in algae, benthic macroinvertebrates, and fish at the 28 sites. Generally, the number of variables in each data set greatly exceeded the number of sampling sites (n = 28); therefore, a large number of variables in each data set had to be eliminated, and the remaining variables were transformed and standardized to meet important statistical assumptions of normality and homogeneity of variance (Legendre and Legendre, 1998; Clarke and Gorley, 2006). The number of variables in each data set was reduced by analyzing the correlation matrix and scatter plots to eliminate strongly correlated, redundant variables. Appendixes in Sprague and others (2006) provide a complete list of all environmental variables sampled and used in our initial analyses.
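As a rough illustration of screening a correlation matrix for redundant variables, the sketch below keeps a variable only if it is not strongly correlated with any variable already kept. The 0.9 cutoff, the use of Pearson correlation, the greedy keep-first rule, and the variable names and values are all assumptions made for this example; the report describes a judgment-based review of correlations and scatter plots, not an automated rule.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_redundant(data, threshold=0.9):
    """Keep each variable unless |r| with an already kept variable >= threshold."""
    kept = []
    for name in data:
        if all(abs(pearson(data[name], data[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical site values for three variables at five sites.
data = {
    "road_density": [1.0, 2.0, 3.0, 4.0, 5.0],
    "impervious_pct": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly collinear with road_density
    "mean_slope": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_redundant(data)
```

Because the result depends on the order in which variables are considered, a screen like this is best treated as a starting point for the kind of manual review the text describes.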

After preliminary analysis of each environmental data set, it became apparent that even with strong data transformations such as log(X + 1), each data set still contained extreme values that dominated and skewed the distributions and results. Consequently, all analyses used rank transformations to eliminate the influence of these extreme values. The multivariate BIO‑ENV (biology-environment relationship) procedure, run with the BEST routine in PRIMER, version 6 (Clarke and Ainsworth, 1993; Clarke and Gorley, 2006), was used to identify, from the larger initial number of variables in each data set, a subset of 5 to 10 variables that best explained the measured variation among the 28 sites. This procedure was completed separately for each data set, and then the final or “best” explanatory environmental variables from each data set were merged. The variables retained for each data set and detailed descriptions are listed in tables A3 to A8.
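The effect of a rank transformation on extreme values can be seen in a small sketch. Assigning tied values the average of their ranks is an assumption here (it is the common convention, but the report does not state how ties were handled), and the input values are hypothetical.

```python
def rank_transform(values):
    """Replace values by their ranks (1 = smallest); ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


# One extreme value (950.0) no longer dominates after ranking.
ranked = rank_transform([0.2, 0.3, 0.3, 950.0])
```

After ranking, the extreme site is simply the highest rank, one step above the next site, regardless of how large the raw value was.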

In addition, a nutrient index was created by taking the first principal component (first axis) of a principal component analysis (PCA) on just the variables TN and TP, then converting the resulting axis scores to a scale from 0 to 100. Because the first PCA axis explains the largest amount of the variation in TN and TP across the 28 sites, so does the nutrient index.
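For two variables, the first principal component can be written down directly, which makes the nutrient index easy to sketch. The version below assumes that TN and TP are standardized to z-scores before the PCA and that the axis scores are rescaled linearly from minimum to maximum; the report does not state either detail, and the concentrations shown are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def nutrient_index(tn, tp):
    """First principal component of standardized TN and TP, rescaled to 0-100.

    Returns (index per site, fraction of variance explained by the first axis).
    """
    def zscores(values):
        m, s = mean(values), pstdev(values)
        return [(v - m) / s for v in values]

    ztn, ztp = zscores(tn), zscores(tp)
    r = mean([a * b for a, b in zip(ztn, ztp)])  # correlation of TN and TP
    # For a 2x2 correlation matrix the first eigenvector is (1, +/-1)/sqrt(2),
    # with eigenvalue 1 + |r|; the sign is chosen so TN loads positively.
    sign = 1.0 if r >= 0 else -1.0
    scores = [(a + sign * b) / sqrt(2) for a, b in zip(ztn, ztp)]
    lo, hi = min(scores), max(scores)
    index = [100 * (s - lo) / (hi - lo) for s in scores]
    explained = (1 + abs(r)) / 2
    return index, explained


# Hypothetical concentrations at five sites.
tn = [0.2, 0.5, 1.0, 2.0, 4.0]
tp = [0.01, 0.03, 0.05, 0.10, 0.30]
index, explained = nutrient_index(tn, tp)
```

With only two input variables, the first axis always explains at least half of the total variance, and more as TN and TP become more strongly correlated.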

The algal data were analyzed using multivariate statistics, and algal water-chemistry metrics were calculated using Algal Data Analysis System (ADAS) software (Tom Cuffney, U.S. Geological Survey, written commun., 2007) that interfaced with an autecological compilation of water-quality indicator traits for more than 6,000 algal taxa (Porter, 2008). ADAS also created the diatom-only taxa-by-site data matrix used for the PRIMER analyses. Nearly all sites contained a high percentage of relatively small nondiatom taxa (mostly blue-green and red algae) that overshadowed the signal from the diatoms. Much information is available on the tolerances and preferences of diatoms for several water chemistry parameters including nutrients, specific conductance, dissolved oxygen (DO), pH, temperature, amount of organic matter, and current velocity. Therefore, multivariate and algal metric analyses were performed only on diatom data, and all-taxa datasets were characterized by relative density (number of cells/cm²).

The Invertebrate Data Analysis System software (IDAS; Cuffney, 2003) was used to resolve taxa ambiguities for invertebrate data and to calculate about 140 benthic macroinvertebrate metrics commonly used in bioassessment (Davis and Simon, 1995; Barbour and others, 1999). Cuffney (2003) and Cuffney and others (2005) describe and discuss the issues of benthic macroinvertebrate taxa ambiguities, which are beyond the scope of this report. The benthic macroinvertebrate metrics included measures of richness, percentage richness, density, percentage density, dominance, organism tolerance, and assemblage diversity. The tolerance metrics reported were based on regional tolerance values for the Pacific Northwest (B. Wisseman, Aquatic Biology Associates, Inc., written commun., 2003) or, for taxa not covered in the Wisseman regional list, on professional judgment. All tolerance values assigned to taxa followed the standard U.S. Environmental Protection Agency (USEPA) tolerance scoring of 0 to 10 from least to most tolerant (Barbour and others, 1999; Cuffney, 2003). Tolerance values then were compared to national and Pacific Northwest regional values reported by Cuffney (2003) to assure consistency and appropriateness. Tolerance metrics were calculated based on richness and abundance (Cuffney, 2003).
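One common way to form richness-based and abundance-based tolerance metrics is to average the tolerance values of the taxa present, either unweighted or weighted by abundance. The sketch below follows that reading; it is an assumption about the general form of such metrics, not a reproduction of the IDAS calculations, and the taxon names, abundances, and tolerance values are hypothetical.

```python
def tolerance_metrics(abundance, tolerance):
    """Return (richness-based, abundance-based) mean tolerance for one site.

    abundance: abundance per taxon at the site.
    tolerance: tolerance value per taxon on the 0 (least tolerant) to
        10 (most tolerant) scale; taxa without a value are skipped.
    """
    scored = {t: n for t, n in abundance.items() if n > 0 and t in tolerance}
    if not scored:
        return None, None
    # Richness-based: every taxon present counts once.
    richness_based = sum(tolerance[t] for t in scored) / len(scored)
    # Abundance-based: each taxon is weighted by its abundance.
    abundance_based = sum(tolerance[t] * n for t, n in scored.items()) / sum(scored.values())
    return richness_based, abundance_based


# Hypothetical site: a tolerant taxon dominates numerically.
site = {"taxon_a": 10, "taxon_b": 30, "taxon_c": 60, "taxon_d": 5}
tol = {"taxon_a": 1, "taxon_b": 4, "taxon_c": 7}  # taxon_d has no assigned value
richness_tol, abundance_tol = tolerance_metrics(site, tol)
```

The two versions can diverge sharply: here the abundance-based value is higher because the most tolerant taxon is also the most abundant.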

All invasive-fish counts were summed to create an aggregated nonnative “pseudospecies” to substitute for the individual nonnative counts (10) because of their limited individual occurrences among the sites. A fish index that summed the scores from four individual metrics also was computed: percentages of salmonids, reticulate sculpins, nonnative species, and native species excluding reticulate sculpins and salmonids. Site values were given scores of 8, 4, 2, or 1 according to the quartile in which the value fell (less than 25 percent, 26 to 50 percent, 51 to 75 percent, or greater than 75 percent). Scores from the four metrics then were summed and converted to a 0 to 100 scale “fish index,” with higher values indicating a more natural fish assemblage. The two native ammocoete lamprey (Lampetra) species were combined into a single category.
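The scoring scheme can be sketched as follows, with several loudly labeled assumptions, because the text does not spell them out: which direction earns the high score for each metric (passed in here as a `higher_is_better` flag), how boundary values such as exactly 25 percent are binned, and that the summed score is rescaled linearly from its minimum possible value (4) to its maximum (32). The site percentages and direction flags in the example are hypothetical.

```python
def quartile_score(percent, higher_is_better):
    """Score a percentage as 8, 4, 2, or 1 by quartile.

    Assumption: with higher_is_better=False the lowest quartile scores 8;
    with higher_is_better=True the order of scores is reversed.
    """
    scores = (8, 4, 2, 1)
    if percent <= 25:
        quartile = 0
    elif percent <= 50:
        quartile = 1
    elif percent <= 75:
        quartile = 2
    else:
        quartile = 3
    return scores[3 - quartile] if higher_is_better else scores[quartile]


def fish_index(metrics):
    """Sum quartile scores and rescale linearly to 0-100 (assumed rescaling)."""
    total = sum(quartile_score(p, hib) for p, hib in metrics)
    lowest, highest = 1 * len(metrics), 8 * len(metrics)
    return 100 * (total - lowest) / (highest - lowest)


# Hypothetical site: (percentage, higher_is_better) for each of four metrics.
site_metrics = [(60, True), (20, False), (5, False), (15, True)]
index = fish_index(site_metrics)
```

For this hypothetical site the four scores are 4, 8, 8, and 1, giving a sum of 21 and an index of about 61 on the 0 to 100 scale.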

Biological data matrices commonly have numerous zero values and a few extreme values, resulting in a highly skewed distribution that requires some form of transformation to bring it closer to a normal distribution before statistical analyses can be completed (Legendre and Legendre, 1998; Clarke and Gorley, 2006). For all multivariate analyses, diatom density data were square-root transformed, and benthic macroinvertebrate counts were converted to abundance values in number per square meter and log(X + 1) transformed. The abundance data for fish species were log(X + 1) transformed to create the site-species matrix for multivariate analysis.
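A minimal sketch of these transformations is shown below. The natural logarithm is an assumption (the report does not give the base), and the counts and sampled area are hypothetical.

```python
from math import log1p, sqrt


def to_density(counts, area_m2):
    """Convert raw counts to number per square meter for a sampled area."""
    return [c / area_m2 for c in counts]


def sqrt_transform(values):
    """Square-root transformation, as described for diatom densities."""
    return [sqrt(v) for v in values]


def log_x_plus_1(values):
    """log(X + 1) transformation; zeros stay at zero (natural log assumed)."""
    return [log1p(v) for v in values]


# Hypothetical invertebrate counts from a 1.25 square-meter sample.
counts = [0, 10, 250]
densities = to_density(counts, 1.25)
transformed = log_x_plus_1(densities)
```

Adding 1 before taking the logarithm keeps the many zero values in a taxa-by-site matrix defined and unchanged, while compressing the few very large values.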

Relating Biological Assemblages to Environmental Factors

Associations between the environmental and biological data (algae, benthic macroinvertebrates, and fish assemblages) were examined using Spearman rank correlations (SAS version 8: Delwiche and Slaughter, 1998) and PRIMER multivariate statistical analyses (ordinations). Nonmetric multidimensional scaling (nMDS) ordinations of the full assemblage data (for fish and benthic invertebrates) or the diatoms-only assemblage (for algae) were generated using Bray-Curtis similarity matrices for each biological assemblage (PRIMER, version 6: Clarke and Gorley, 2006). This method reduces the complex multidimensional nature of ecological data (for example, multiple species across many sites) to a reduced set of axes (1–4) that attempts to capture as much as possible of the variation among sites in the original multidimensional data matrix (for more detailed information on multivariate ordinations, see Legendre and Legendre, 1998). The result is a 2-axis plot where samples (sites) are positioned according to their degree of similarity in taxonomic composition. The goal is to reduce the complex multivariate species data to two ordination axes, which then may be correlated with environmental factors that may influence the species composition. In addition, the environmental matrix (Euclidean distance) was related directly to the ecological matrices using the BEST procedure in PRIMER to determine the final subset of the environmental variables that best describes the variation in the ecological species matrix (nMDS ordination) among the 28 sites.
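The idea behind relating the environmental matrix to the biological matrix can be sketched without PRIMER: compute Bray-Curtis dissimilarities among sites from the species data, compute Euclidean distances among sites for each candidate subset of environmental variables, and keep the subset whose distances have the highest Spearman rank correlation with the biological dissimilarities. This is a simplified illustration of that search, not the PRIMER implementation. It omits the nMDS ordination itself, searches only small subsets, and uses hypothetical data and variable names.

```python
from itertools import combinations
from math import sqrt


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two sites' abundance vectors."""
    total = sum(a + b for a, b in zip(u, v))
    return sum(abs(a - b) for a, b in zip(u, v)) / total if total else 0.0


def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise(rows, dist):
    """Distances for every pair of sites, in a fixed pair order."""
    return [dist(rows[i], rows[j]) for i, j in combinations(range(len(rows)), 2)]


def ranks(values):
    """Ranks with ties assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out, i = [0.0] * len(values), 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def best_subset(species, env, max_vars=2):
    """Return (rho, variables) for the best-matching environmental subset."""
    bio = pairwise(species, bray_curtis)
    best = (-2.0, ())
    for k in range(1, max_vars + 1):
        for subset in combinations(list(env), k):
            rows = [[env[v][i] for v in subset] for i in range(len(species))]
            rho = spearman(bio, pairwise(rows, euclidean))
            if rho > best[0]:
                best = (rho, subset)
    return best


# Hypothetical data: four sites, two taxa, two environmental variables.
species = [[10, 0], [8, 2], [2, 8], [0, 10]]
env = {"gradient": [1, 2, 4, 5], "noise": [3, 1, 2, 3]}
rho, chosen = best_subset(species, env)
```

In this toy example the variable that tracks the shift in taxonomic composition is selected on its own, because adding the unrelated variable only degrades the match between the two distance matrices.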

In this report, Spearman rank correlation coefficients (rho values) were considered strong when greater than or equal to 0.66 and moderate when between 0.50 and 0.66. All rho values greater than 0.50 were statistically significant at P less than 0.05. The different analytical techniques used, such as scatter plots, summary graphs, correlations, and multivariate analyses, although common and robust, do not prove direct cause and effect. They are useful, however, for providing insights into ecological processes, for revealing potential environmental pathways, and for generating hypotheses.
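The thresholds above translate into a simple classification rule. The sketch assumes that the thresholds apply to the absolute value of rho, so that strong negative correlations are classed the same way as strong positive ones, and that 0.50 itself counts as moderate; the label for values below 0.50 is a placeholder, because the report does not name that class.

```python
def classify_rho(rho):
    """Label a Spearman rho using the report's 0.66 and 0.50 thresholds."""
    strength = abs(rho)  # assumption: the sign is ignored when classifying
    if strength >= 0.66:
        return "strong"
    if strength >= 0.50:
        return "moderate"
    return "below threshold"  # placeholder label, not used in the report


labels = [classify_rho(r) for r in (0.72, -0.55, 0.31)]
```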

U.S. Department of the Interior | U.S. Geological Survey
URL: https://pubs.usgs.gov/sir/2006/5101-D
Page Last Modified: Thursday, 01-Dec-2016 19:24:17 EST