Methods of Rating Unsaturated Zone and Watershed Characteristics of Public Water Supplies in North Carolina

Water-Resources Investigations Report 99-4283
By Jo Leslie Eimers, J. Curtis Weaver, Silvia Terziotti, and Robert W. Midgette


APPENDIX: Documentation on Geographic Information System (GIS) Data Sources

Land Use and Land Cover
Elevation and Slope
Hypsography pre-processing
Hydrography pre-processing
Shoreline pre-processing
Soils
Precipitation
Ground-Water Contribution

Land Use and Land Cover

The source for both the land-use and land-cover components is the Multi-Resolution Land Characteristics (MRLC) data set. The MRLC data set is a product of the MRLC Consortium, which consists of the Ecological Monitoring and Assessment Program of the U.S. Environmental Protection Agency (USEPA), U.S. Forest Service Remote Sensing Application Center, Gap Analysis Program of the U.S. Geological Survey (USGS) Biological Resources Division, Coastal Change Analysis Program of the National Oceanic and Atmospheric Administration (NOAA), and the National Water-Quality Assessment (NAWQA) Program and EROS Data Center (EDC) of the USGS. The mechanisms for collaboration were formalized with the signing of a Memorandum of Understanding in 1995.

The main objective of the MRLC consortium was to generate a generalized and consistent (seamless) land-cover data layer for the entire conterminous United States (Bara, 1994). The North Carolina portion of the data set was created as part of the land-cover mapping activities for Federal Region IV (the States of Kentucky, Tennessee, Mississippi, Alabama, North Carolina, South Carolina, Georgia, and Florida). The development of the Region IV data set was initiated during the spring of 1997, and a first draft product was completed during summer 1997. This data set was developed by personnel at the EDC, Sioux Falls, S.D.

The primary source of data for the MRLC data set was leaves-off Landsat Thematic Mapper (TM) data acquired during 1990-93, primarily during the spring seasons of 1991, 1992, and 1993 (Vogelmann and others, 1998). Additionally, leaves-on (summer) TM data sets were acquired and referenced. In total, 24 TM scenes were analyzed. These data sets were referenced to Albers Conical Equal Area coordinates, but were projected to the North Carolina State Plane coordinate system for this project.

The general procedure used for processing the MRLC data set was to (1) mosaic multiple leaves-off TM scenes and classify them by using an unsupervised classification algorithm, (2) interpret and label classes into land-cover categories by using aerial photographs as reference data, (3) resolve confused classes by using the appropriate ancillary data source(s), and (4) incorporate land-cover information from leaves-on TM data, National Wetlands Inventory (NWI) data, and other data sources to refine and augment the "basic" classification developed above. More detailed information about the background and production process of the MRLC data sets can be obtained at the USEPA web site http://www.epa.gov/mrlc.

To test the methods, the MRLC data set was resampled from a 30-meter by 30-meter grid to a 60-meter by 60-meter grid, shifted, and snapped to match the lower-left corner of the other contributing-factor data sets. Because the MRLC data are categorical, a nearest neighbor algorithm was used to maintain the classification scheme. This resampling technique assigns to each cell in the coarser data set the value of the original cell closest to the center of the larger cell. Resampling by using the nearest neighbor algorithm from a 30-meter by 30-meter grid to a 60-meter by 60-meter grid implies that the value from only 1 cell out of each 4-cell neighborhood will be represented in the output data set. However, the overall representation of land-cover and land-use classes is the same: the percentage of area within each land-use and land-cover class is the same statewide for the 30-meter by 30-meter and 60-meter by 60-meter data sets.
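The 2:1 nearest neighbor resampling described above can be illustrated with a minimal Python sketch. It is not the ARC/INFO procedure used for this report; the function names, the tie-breaking offsets, and the synthetic class codes are assumptions made for illustration only.

```python
import numpy as np

def resample_nearest_2x(grid, row_offset=1, col_offset=1):
    """Resample a categorical grid to twice the cell size by nearest neighbor.

    With an exact 2:1 ratio, the center of each coarse cell falls on the
    shared corner of four fine cells, so all four are equally near. The
    offsets (0 or 1) are a hypothetical tie-breaking choice that selects
    which of the four cells is kept.
    """
    grid = np.asarray(grid)
    return grid[row_offset::2, col_offset::2]

def class_fractions(grid):
    """Return {class code: fraction of cells} for a categorical grid."""
    codes, counts = np.unique(grid, return_counts=True)
    return {int(c): float(n) / grid.size for c, n in zip(codes, counts)}

# Synthetic 30-meter grid with three made-up class codes.
rng = np.random.default_rng(0)
fine = rng.choice([11, 41, 81], size=(400, 400), p=[0.1, 0.6, 0.3])
coarse = resample_nearest_2x(fine)

# Only 1 of every 4 fine cells survives, and no new class codes appear.
fine_frac = class_fractions(fine)
coarse_frac = class_fractions(coarse)
```

Because values are copied rather than averaged, class codes are never blended, and over a large area the class fractions of the coarse grid stay close to those of the fine grid.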

When rating methods are applied statewide, cell size will be retained at 30 meters by 30 meters. A 1999 release of the MRLC land-use and land-cover data (still the 1990-93 data, but with better distinctions among some categories) should be used.

Elevation and Slope

The source of the elevation data set used to derive the slope components is a Digital Elevation Model (DEM) developed by the USGS and North Carolina State University. The ARC/INFO version 7.1.1 TOPOGRID command was used to process the elevation surface model. TOPOGRID incorporates the software package, developed by Michael Hutchinson at Australian National University, known as "ANUDEM" (abbreviated form of Australian National University Digital Elevation Model) to produce the DEM. Four types of input data were used for the production of the DEM: hypsography (land-surface elevation) contour lines, hypsography points, hydrography, and shoreline. Following ANUDEM processing, a "fill" procedure (Jenson and Domingue, 1988) was used to remove remaining depressions. Each of the pre-processing steps is described briefly below.

Hypsography pre-processing:

The USGS 1:100,000-scale digital line graph (DLG) hypsography files were downloaded from the USGS GeoData web site (http://edcwww.cr.usgs.gov/doc/edchome/ndcdb/ndcdb.html). The DLG files were converted into a point GIS layer and a contour line GIS layer. Only the elevation and depression contours were used.

Hydrography pre-processing:

The 1:100,000-scale hydrography data are an early release of the River-Reach File (RF-3) distributed by the USEPA. Cataloging units that include any part of North Carolina were processed. Several changes were made in the RF-3 data set before its use in TOPOGRID. First, many small water bodies and streams that were not connected to the main stream network were eliminated. Larger unconnected streams were retained. Second, centerlines were generated for all large lakes, wide streams, and other water bodies. The polygons forming the water bodies were removed. Because the stream centerlines were used in the creation of the DEM rather than water-body polygons, the DEM is not flat in the areas covered by water. Third, TOPOGRID requires that all streams point downstream, so all lines pointing upstream were flipped. Finally, the RF-3 data were incomplete in several places. Large parts of several rivers and lakes were missing. Corrections were made by using data extracted from the USGS 1:100,000-scale hydrography DLG.

Shoreline pre-processing:

The shoreline of North Carolina at 1:24,000 scale was combined with the shoreline of adjacent states at 1:70,000 scale. The shoreline data were processed into a GIS data layer that defined water and land. The shoreline arc also was entered as a contour elevation with a zero-meters elevation value. Shoreline areas were examined to verify that no overlap of contour lines with shorelines occurred.

Once the input data sets were finalized, the data were processed through the TOPOGRID function in half-degree by 1-degree geographic blocks. A 6-kilometer area of overlap was included around each block to minimize edge effects. The blocks were then mosaicked together to create a seamless DEM for the State. The DEM tested in several areas of the State has a final resolution of 60 meters by 60 meters and is stored as a floating-point raster grid. A percentage slope data layer was derived by using the SLOPE function in the ARC/INFO GRID module.
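As an illustration of what a percentage slope layer represents, the following Python sketch computes rise over run, times 100, from a floating-point DEM array. It uses simple finite differences from NumPy rather than the ARC/INFO SLOPE function, so values can differ from the layer used in this report; the function name and the default 60-meter cell size are assumptions.

```python
import numpy as np

def percent_slope(dem, cell_size=60.0):
    """Percentage slope (100 * rise / run) for a DEM stored as a 2-D array.

    Elevation and cell_size must be in the same units (meters here).
    np.gradient uses central differences in the interior and one-sided
    differences on the edges, which is an assumption of this sketch and
    not necessarily the neighborhood method used by ARC/INFO.
    """
    dz_drow, dz_dcol = np.gradient(np.asarray(dem, dtype=float), cell_size)
    return 100.0 * np.hypot(dz_dcol, dz_drow)

# A plane that rises 6 meters per 60-meter cell has a 10-percent slope.
plane = np.tile(np.arange(5) * 6.0, (4, 1))
slope = percent_slope(plane)
```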

When rating methods are applied statewide, GIS grid-cell size will be reduced to 30 meters by 30 meters. Improved slope data (based on 1:24,000 DEM data base) will be used.

Soils

Two sources of soil data were used for this report: county level and state level. Soil types by county were identified in the SSURGO data base of the Natural Resources Conservation Service (NRCS). The NRCS developed the SSURGO data base at a scale of 1:24,000, primarily for use in the natural-resource planning and management of farms and ranches, townships, or counties and by landowners/users. At the time of this report, county-level data have been processed for Alamance, Beaufort, Brunswick, Cabarrus, Currituck, Durham, Edgecombe, Granville, Guilford, Halifax, Hyde, Mecklenburg, Nash, Orange, and Stanly Counties in North Carolina.

Where county-level soil information was not available, the STATSGO data base for North Carolina was used. STATSGO is a digital, general-soils association map developed by the NRCS. It consists of a broad inventory of soil and non-soil areas that occur in a repeatable pattern on the landscape and that can be cartographically shown at the scale mapped. The soil maps for STATSGO are compiled by generalizing more detailed soil survey maps. Where more detailed soil survey maps are not available, data on geology, topography, vegetation, and climate are assembled, together with Land Remote Sensing Satellite (LANDSAT) images. Soils of like areas are studied, and the probable classification and extent of the soils are determined. STATSGO maps are at the 1:250,000 scale and are designed primarily for regional, multicounty, river basin, State, and multistate resource planning, management, and monitoring.

To test the rating methods, the STATSGO and SSURGO soil layers were compiled into one layer with a cell size of 60 meters by 60 meters. The SSURGO data were superimposed on the STATSGO data so that the best available data are always used. When rating methods are applied statewide, a cell size of 30 meters by 30 meters will be used.
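Superimposing the county-level data on the state-level data amounts to a cell-by-cell choice between two aligned grids. A minimal Python sketch follows, assuming both layers are already rasterized to the same extent and cell size and that cells without SSURGO coverage carry a no-data code; the code -9999 and the function name are hypothetical.

```python
import numpy as np

NODATA = -9999  # hypothetical no-data code for cells without SSURGO coverage

def merge_soil_layers(ssurgo, statsgo, nodata=NODATA):
    """Use the SSURGO value where one exists; otherwise fall back to STATSGO."""
    ssurgo = np.asarray(ssurgo)
    statsgo = np.asarray(statsgo)
    return np.where(ssurgo != nodata, ssurgo, statsgo)

ssurgo = np.array([[101, NODATA], [NODATA, 104]])
statsgo = np.array([[7, 7], [8, 8]])
merged = merge_soil_layers(ssurgo, statsgo)
```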

Information about soil permeability and thickness was obtained from the Map Unit Interpretation Record (MUIR) attribute data base that is linked to the SSURGO soil-unit delineation and the STATSGO mapping unit. MUIR contains information about soils and individual layers within soils. Some problems were encountered with the attribute data for the SSURGO data. Certain soil series were not assigned permeability or thickness values, including dams, gullied lands, pits, mines, quarries, stony lands, udorthents, urban lands, dunes, and water. For statewide evaluation, missing values should be assigned the STATSGO value for the area.

ARC/INFO programs were written to process the MUIR data to extract thickness and permeability by layer for each soil unit. For SSURGO and STATSGO data, the weighted average by percentage of each soil component was applied to each mapping unit for thickness and harmonic mean permeability. The body of the report defines the equations used to calculate the harmonic mean permeability values.
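The two calculations named above can be sketched in Python. The equations defined in the body of the report are not reproduced in this appendix, so the thickness-weighted form of the harmonic mean used here is an assumption (it is the usual form for flow across stacked layers), and the function names and sample values are illustrative only.

```python
def harmonic_mean_permeability(thicknesses, permeabilities):
    """Thickness-weighted harmonic mean permeability of stacked soil layers.

    Assumed form: sum(d_i) / sum(d_i / k_i). Low-permeability layers
    dominate the result, as expected for flow through layers in series.
    """
    total_thickness = sum(thicknesses)
    resistance = sum(d / k for d, k in zip(thicknesses, permeabilities))
    return total_thickness / resistance

def component_weighted_average(values, percents):
    """Average of per-component values weighted by component percentage."""
    return sum(v * p for v, p in zip(values, percents)) / sum(percents)

# Two layers of equal thickness with permeabilities of 2 and 6 (same units).
k_component = harmonic_mean_permeability([1.0, 1.0], [2.0, 6.0])

# A mapping unit made up of two soil components covering 75 and 25 percent.
k_unit = component_weighted_average([k_component, 1.0], [75, 25])
```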

More information on STATSGO, SSURGO, and the MUIR data bases can be obtained from the U.S. Department of Agriculture, NRCS, National Soil Survey Center, National Soil Data Access Facility web site, http://www.statlab.iastate.edu/soils/nsdaf/.

Precipitation

The mean monthly precipitation estimates were generated by the Parameter-Elevation Regressions on Independent Slopes Model (PRISM) (Daly and others, 1994, 1997). PRISM is an analytical tool that uses point data, a DEM, and other spatial data sets to generate estimates of monthly, yearly, and event-based climatic parameters, such as precipitation, temperature, snowfall, degree days, and dew point. PRISM-derived data sets have been used in applications of climatology, hydrology, natural resources, global climate change, land use, planning, relocation, education, and geography. PRISM is uniquely designed to map climate in the most difficult situations, including high mountains, rain shadows, temperature inversions, coastal regions, and other complex climatic regimes.

PRISM uses a DEM to estimate the elevations of precipitation stations at the proper orographic scale, and uses the DEM and a windowing technique to group stations onto individual topographic facets. For each DEM grid cell, PRISM develops a weighted precipitation/elevation (P/E) regression function from nearby stations, and predicts precipitation at the cell's DEM elevation with this function. In the regression, greater weight is given to stations with location, elevation, and topographic positioning similar to that of the grid cell. Whenever possible, PRISM calculates a prediction interval for the estimate, which is an approximation of the uncertainty involved. By relying on many localized, facet-specific P/E relations rather than a single domain-wide relation, PRISM continually adjusts its frame of reference to accommodate local and regional changes in orographic regime with minimal loss of predictive capability.
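The per-cell weighted regression can be illustrated with a short Python sketch of weighted least squares for a straight-line P/E relation. This is not PRISM code: how PRISM derives station weights from location, elevation, and topographic facet is not described here, so the weights are simply supplied by the caller, and all names and values are hypothetical.

```python
import numpy as np

def weighted_pe_prediction(station_elev, station_precip, weights, cell_elev):
    """Predict precipitation at cell_elev from a weighted linear P/E fit.

    Fits precip = intercept + slope * elevation by weighted least squares,
    where each station's squared residual is multiplied by its weight.
    """
    x = np.asarray(station_elev, dtype=float)
    y = np.asarray(station_precip, dtype=float)
    w = np.asarray(weights, dtype=float)
    x_mean = np.average(x, weights=w)
    y_mean = np.average(y, weights=w)
    slope = np.sum(w * (x - x_mean) * (y - y_mean)) / np.sum(w * (x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return intercept + slope * cell_elev

# Made-up stations that lie exactly on precip = 1000 + 0.5 * elevation.
elev = [100.0, 200.0, 400.0]
precip = [1050.0, 1100.0, 1200.0]
prediction = weighted_pe_prediction(elev, precip, [1.0, 2.0, 0.5], 300.0)
```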

Data entry into the national model consisted of 1961-90 mean monthly precipitation data from more than 8,000 NOAA cooperative sites, snow telemetry (SNOTEL) sites, and selected State network stations. Data-sparse areas were supplemented by a total of about 500 short-term stations. A station was included in this data set if it had at least 20 years of valid data, regardless of its period of record. PRISM software was used to minimize "seams" along State and regional boundaries. The North Carolina portion of the data set is distributed separately.

The DEM data for the model are 1:250,000-scale, distributed by the USGS. These data and their associated metadata are available from the USGS web site http://edcwww.cr.usgs.gov/doc/edchome/ndcdb/ndcdb.html.

Summing 12 monthly maps for the country created the national mean annual precipitation maps. The annual maps underwent extensive peer review by many State climatologists and other experts. This is part of a national effort by the NRCS and Oregon State University to develop state-of-the-art precipitation maps for each State in the United States.
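The summing of monthly maps into an annual map is a cell-by-cell addition. A minimal Python sketch, assuming the 12 monthly grids are stacked into one array of shape (12, rows, columns); the array values are placeholders, not PRISM data.

```python
import numpy as np

def annual_from_monthly(monthly):
    """Sum 12 monthly precipitation grids into one annual grid."""
    monthly = np.asarray(monthly, dtype=float)
    if monthly.shape[0] != 12:
        raise ValueError("expected 12 monthly grids along the first axis")
    return monthly.sum(axis=0)

# Placeholder grids: 100 units of precipitation in every cell, every month.
monthly = np.full((12, 2, 3), 100.0)
annual = annual_from_monthly(monthly)
```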

Precipitation estimated for each grid cell is an average over the entire area of that cell; thus, point precipitation can be estimated at a spatial precision no better than half the resolution of a cell. Because the precipitation data were distributed at a resolution of approximately 4 kilometers (km), point precipitation can be estimated at a spatial precision no better than 2 km. However, the overall distribution of precipitation features is thought to be accurate. For further information, the online PRISM homepage can be accessed at the Oregon State University's "Climate Mapping with PRISM" web site, http://www.ocs.orst.edu/prism/prism_new.html.

Ground-Water Contribution

The ground-water contribution component was derived by applying the unsaturated zone ratings within a 305-meter area around all streams identified in the early release of the RF-3 distributed by the USEPA.

First, 305-meter polygons were drawn around all streams by using the ARC/INFO "BUFFER" command. The streams were processed within 8-digit hydrologic cataloging unit areas. Occasionally, the buffered areas extended over the ridgelines defined by the cataloging units. The areas that extended over the cataloging unit boundaries were removed.

Next, the buffered streams were adjusted to include water bodies that overlapped the buffered stream areas. This ensured that the middle of each lake wider than 305 meters was included in the analysis of ground-water contribution.

Finally, the unsaturated-zone component of inherent vulnerability was applied to the buffered zones within the six pilot sites by using overlay analysis. A raster layer with 60-meter by 60-meter cells was created for testing the rating methods in several areas of the State; however, 30-meter by 30-meter cells will be used when the method is applied statewide.
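The buffer, water-body, and overlay steps above can be approximated on a raster with the Python sketch below. It is not the vector ARC/INFO BUFFER workflow used for this report: distances are measured between cell centers, the removal of areas beyond cataloging-unit boundaries is omitted, and the water mask is assumed to contain only water bodies that already overlap the buffer. Function names are hypothetical.

```python
import numpy as np

def buffer_mask(stream_mask, distance=305.0, cell_size=60.0):
    """Mark cells whose centers lie within `distance` of any stream cell.

    Brute-force comparison of every cell with every stream cell; adequate
    for a small demonstration grid, not for a statewide raster.
    """
    stream_mask = np.asarray(stream_mask, dtype=bool)
    s_rows, s_cols = np.nonzero(stream_mask)
    if s_rows.size == 0:
        return np.zeros(stream_mask.shape, dtype=bool)
    rows, cols = np.indices(stream_mask.shape)
    d_rows = rows[..., None] - s_rows
    d_cols = cols[..., None] - s_cols
    nearest = cell_size * np.sqrt(d_rows ** 2 + d_cols ** 2).min(axis=-1)
    return nearest <= distance

def ground_water_zone_ratings(ratings, stream_mask, water_mask):
    """Keep unsaturated-zone ratings inside the buffered zone; NaN elsewhere."""
    zone = buffer_mask(stream_mask) | np.asarray(water_mask, dtype=bool)
    return np.where(zone, np.asarray(ratings, dtype=float), np.nan)

# One row of eight 60-meter cells with a stream in the first cell.
streams = np.zeros((1, 8), dtype=bool)
streams[0, 0] = True
water = np.zeros((1, 8), dtype=bool)
water[0, 7] = True
ratings = np.arange(8.0).reshape(1, 8)
zone_ratings = ground_water_zone_ratings(ratings, streams, water)
```

With 60-meter cells, the 305-meter buffer reaches the fifth cell beyond the stream (300 meters between centers) but not the sixth (360 meters).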




Last modified: Tue Nov 7 13:52:13 EST 2000