Data Series 279
National Water-Quality Assessment Program
Geographic information system (GIS) analysis was used to describe the natural and anthropogenic characteristics of the basin for each site. Basin boundaries were derived from USGS 30-m national elevation data (U.S. Geological Survey, 2005a). The boundaries were then overlain with mapped data representing natural features and human activities and values. Most variables were derived for the entire basin; however, several categories of variables were calculated on finer scales, such as within riparian zones or stream segments. Streams were mapped from the USGS National Hydrography Dataset (NHD) at the 1:100,000 scale (U.S. Geological Survey, 2005b). GIS-derived variables are presented in the broad categories of natural environmental setting (ecoregions, soils, topography, climate), land cover, landscape pattern, population and housing, infrastructure, and stream segment.
The natural setting of each basin was characterized by using U.S. Environmental Protection Agency (USEPA) level IV ecoregions (Griffith and others, 2002) and USGS hydrologic landscape regions (U.S. Geological Survey, 2003). Soil properties, such as texture and drainage, were derived from the Natural Resources Conservation Service State Soil Geographic (STATSGO) database (U.S. Department of Agriculture, 1994), and topographic characteristics, such as basin relief and mean basin slope, were derived from USGS 30-m national elevation data (U.S. Geological Survey, 2005a). Basin-level mean air temperature and precipitation statistics were derived from 1-kilometer resolution Daymet model data (Daymet, 2005), which represented 18 years (1980–97) of mean temperature and precipitation data obtained from terrain-adjusted daily climatological observations.
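As an illustration of the topographic characteristics named above, the following minimal sketch computes basin relief and mean basin slope from lists of cell values. It assumes relief is taken as the difference between the highest and lowest elevation cells in the basin, and that per-cell slope values have already been computed from the elevation data; the function names and sample numbers are hypothetical and are not taken from the report.

```python
from statistics import fmean


def basin_relief(elevations):
    """Return maximum minus minimum elevation (same units as the input).

    Assumption: relief is the simple range of elevation cells in the basin.
    """
    values = list(elevations)
    if not values:
        raise ValueError("no elevation cells in basin")
    return max(values) - min(values)


def mean_basin_slope(cell_slopes):
    """Return the arithmetic mean of precomputed per-cell slope values."""
    values = list(cell_slopes)
    if not values:
        raise ValueError("no slope cells in basin")
    return fmean(values)


# Hypothetical cell values for a very small basin.
example_elevations = [212.0, 240.5, 198.0, 305.5]
example_slopes = [2.0, 4.0, 6.0]

example_relief = basin_relief(example_elevations)
example_mean_slope = mean_basin_slope(example_slopes)
```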
Land-cover data were derived from the National Land Cover 1992 (NLCD92) and 2001 (NLCD01) datasets (U.S. Geological Survey, 2005c). The NLCD01 is a 16-class, 30-m resolution dataset based primarily on Landsat-7 enhanced thematic mapper data for the period 1999–2002, which together represent a composite for approximately 2001. In addition to the 16-class dataset, land-cover data were aggregated to eight Level I classes (for example, deciduous forest, evergreen forest, and mixed forest were aggregated to “forest”; Anderson and others, 1976). The NLCD01 also contains a subpixel percent impervious-surface data layer. An internal accuracy assessment found that the NLCD01 data generally underestimated impervious surface (mean difference from ground truth = –13.4 percent) (James Falcone, USGS, written commun., February 2006). Land-cover variables were calculated for each study basin and for its stream riparian zone, which was delineated from NHD stream lines for the entire basin. The riparian zone was defined as the area extending approximately 100 m on each side of the stream centerline.
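The aggregation to Level I classes and the calculation of land-cover percentages for a basin or riparian zone can be sketched as follows. This is a minimal example under stated assumptions: the lookup table is hypothetical, only the three forest entries come from the text, and the cell counts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical lookup from detailed class name to Level I class.
# Only the three forest entries are taken from the text; the other
# entries are illustrative assumptions.
LEVEL1_LOOKUP = {
    "deciduous forest": "forest",
    "evergreen forest": "forest",
    "mixed forest": "forest",
    "open water": "water",
    "developed, low intensity": "urban",
    "developed, high intensity": "urban",
}


def level1_percentages(cell_counts):
    """Aggregate per-class cell counts to Level I percent of total cells."""
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("no land-cover cells in zone")
    grouped = defaultdict(int)
    for class_name, count in cell_counts.items():
        # Classes missing from the lookup are grouped as "other".
        grouped[LEVEL1_LOOKUP.get(class_name, "other")] += count
    return {name: 100.0 * count / total for name, count in grouped.items()}


# Invented 30-m cell counts for one basin.
basin_counts = {
    "deciduous forest": 400,
    "evergreen forest": 100,
    "mixed forest": 100,
    "open water": 50,
    "developed, low intensity": 350,
}
basin_pct = level1_percentages(basin_counts)
```

The same function could be applied to the cells that fall within the riparian zone to obtain riparian land-cover percentages.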
Landscape pattern metrics characterizing the shape, size, and spatial configuration of land-cover patches were derived by using the FRAGSTATS software package (McGarigal and Marks, 1995). Basin land-cover data were reclassified to Level I classifications (water, urban, forest), and then FRAGSTATS metrics were calculated for patches of each class type. An additional metric (Basin Shape Index) was calculated based on the entire basin boundary.
Basin population and population density were calculated based on 2000 Census block-level data (GeoLytics, Inc., 2004). All other census variables (demographic, labor, income, and housing characteristics) were calculated based on 2000 Census block-group data. Four socioeconomic indexes (SEI) were additionally derived based on principal component ordination of 65 census variables, as described in McMahon and Cuffney (2000). The ordination extracts the primary sources of variability among census block groups, such that the first axis of the ordination (represented by SEI-1) describes the principal ways that the block groups can be distinguished. Subsequent axes describe the next most important ways that the data are structured. Area weighting was used to apportion values for these indexes and the associated 65 variables from the block groups to each of the study basins. Variable weights for each axis are provided in the descriptive portion of the basin characterization (census) data file. Infrastructure data were based on Census 2000 TIGER roads (GeoLytics, Inc., 2004), point-source dischargers from USEPA National Pollutant Discharge Elimination System (NPDES) locations (U.S. Environmental Protection Agency, 2005a), and Toxic Release Inventory locations (U.S. Environmental Protection Agency, 2005b).
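The area-weighting step can be illustrated with a short sketch. The report does not give the exact formula, so this example assumes one common form: each block-group value is weighted by the area of that block group lying inside the basin. The function name and the sample values are hypothetical.

```python
def area_weighted_mean(pieces):
    """Return the area-weighted mean of block-group values for one basin.

    pieces: iterable of (value, overlap_area) pairs, where overlap_area is
    the area of the block group that falls inside the basin (assumed form
    of the weighting; the source does not spell it out).
    """
    pieces = list(pieces)
    total_area = sum(area for _, area in pieces)
    if total_area <= 0:
        raise ValueError("block groups do not overlap the basin")
    return sum(value * area for value, area in pieces) / total_area


# Hypothetical index values for two block groups overlapping one basin,
# with overlap areas in arbitrary but consistent units.
example_pieces = [(10.0, 2.0), (20.0, 6.0)]
example_basin_value = area_weighted_mean(example_pieces)
```

Weighting by overlap area suits rate-like or index-like variables; a count such as total population would instead be apportioned by the fraction of each unit inside the basin and then summed.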
Stream segments were identified to examine stream conditions close to the sampled reach. Segment lengths were defined as a function of drainage area (log10(drainage area) × 1,000), and each segment was located starting at the study site and proceeding upstream. NLCD01 land-cover statistics were derived for the riparian zone (approximately 100 m on each side of the stream centerline) of the stream segment. Stream-segment statistics also were calculated for physical characteristics that were not related to land cover: sinuosity, gradient (based primarily on 30-m National Elevation Dataset (NED) data), mean distance to the nearest road, and density of road/stream intersections along the length of the segment.
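Two of the stream-segment calculations can be sketched briefly. The example reads the segment-length expression as the base-10 logarithm of drainage area multiplied by 1,000, which is an interpretation; the report does not state the units, so none are assumed here. Sinuosity is computed in its usual sense, as channel length divided by the straight-line distance between the segment endpoints. Function names and coordinates are hypothetical.

```python
import math


def segment_length(drainage_area):
    """Return log10(drainage_area) * 1,000 (assumed reading of the text).

    Units are not specified in the source. Drainage areas below 1 give a
    negative result, so this form is meaningful only for larger basins.
    """
    if drainage_area <= 0:
        raise ValueError("drainage area must be positive")
    return math.log10(drainage_area) * 1000.0


def sinuosity(points):
    """Return channel length divided by straight-line endpoint distance.

    points: ordered (x, y) vertices of the stream segment centerline.
    """
    if len(points) < 2:
        raise ValueError("need at least two vertices")
    channel = sum(math.dist(points[i], points[i + 1])
                  for i in range(len(points) - 1))
    straight = math.dist(points[0], points[-1])
    if straight == 0:
        raise ValueError("segment endpoints coincide")
    return channel / straight


example_length = segment_length(100.0)
example_sinuosity = sinuosity([(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)])
```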